Preface



The conference series Logical Aspects of Computational Linguistics (LACL) aims 
at providing a forum for the presentation and discussion of current research in all 
the formal and logical aspects of computational linguistics. The LACL initiative 
started with a workshop held in Nancy (France) in 1995. Selected papers from 
this event have appeared as a special issue of the Journal of Logic, Language
and Information, Volume 7(4), 1998. In 1996, LACL shifted to the format of an
international conference. LACL’96 and ’97 were both held in Nancy (France). 
The proceedings appeared as volumes 1328 and 1582 of the Springer Lecture 
Notes in Artificial Intelligence. 

This volume contains selected papers of the third international conference 
on Logical Aspects of Computational Linguistics (LACL’98), held in Grenoble, 
France, from December 14 to 16, 1998. The conference was organized by the Uni- 
versity Pierre Mendes-France (Grenoble 2) together with LORIA (Laboratoire 
Lorrain d’Informatique et Applications, Nancy). On the basis of 33 submitted 
4-page abstracts, the Program Committee selected 19 contributions for presen- 
tation. In addition to the selected papers, the program featured three invited 
talks, by Maarten de Rijke (ILLC, Amsterdam), Makoto Kanazawa (Chiba Uni- 
versity, Japan), and Fernando Pereira (AT&T Labs). After the conference, the 
contributors were invited to submit a full paper for the conference proceedings. 

I thank the members of the Program Committee for their conscientious work 
in refereeing the submitted abstracts and the full papers, and the authors of the 
invited papers and the regular presentations for their stimulating contributions 
to the conference. Special thanks go to the Organizing Committee, in particular 
Alain Lecomte, whose efficiency contributed greatly to the success of LACL’98. 
Deductions with Meaning



Christof Monz and Maarten de Rijke 

Institute for Logic, Language and Computation (ILLC) 
University of Amsterdam, Plantage Muidergracht 24, 
1018 TV Amsterdam, the Netherlands, 
{christof, mdr}@wins.uva.nl



Abstract. In this paper, we consider some of the problems that arise if 
automated reasoning methods are applied to natural language semantics. 
It turns out that the problem of ambiguity has a strong impact on
the feasibility of any theorem prover for computational semantics. We 
briefly investigate the different aspects of ambiguity and review some of 
the solutions that have been proposed to tackle this problem. 



1 Introduction 

One of the concluding slogans of the FraCaS project on Frameworks for
Computational Semantics is that ‘[t]here can be no semantics without logic’. We
take this to mean that formalisms for semantic representation should be devel- 
oped hand-in-hand with inference methods for performing reasoning tasks with 
representations and algorithms for representation construction. 

Clearly, to be usable in the first place, representation formalisms need to come 
equipped with construction methods, and this explains the need for algorithmic 
tools. But what about the need for inference methods? At least three types of 
reasons can be identified. For cognitive purposes one may want to test the truth 
conditions of a representation against (a model of) speakers’ intuitions — this 
amounts to a model checking or theorem proving task. Also, the whole issue 
of what it is to understand a discourse may be phrased as a model generation 
task. Computationally, we need various reasoning tasks and AI-heuristics to help
resolve quantifier scope ambiguity, or to resolve anaphoric relations in informa- 
tion extraction and natural language queries. And last, but not least, the very 
construction of semantic representations may require inference tools to be used 
in checking for consistency and informativity. At the end of the day, the main 
purpose of a semantic representation is that we can do something with it, both 
algorithmically and in terms of inference tasks. 

Now, the present times are exciting ones for anyone with an interest in
inference for natural language semantics. On the one hand, there is work in semantics
that pays little or no attention to inferential aspects. This is certainly the case
for a lot of work in dynamic semantics and underspecified representation, and in 
the recent Handbook of Logic and Language jO] inferential methods for semantic 
representations are largely absent, despite the fact that a substantial part of the 
book is devoted to representational matters. 
At the same time, there is a growing body of work aimed at developing
inference methods and tools for natural language semantics, fed by a growing 
realization that these are ‘the heart of the enterprise’ [7, page viii]. This is
manifested not only by various research initiatives (see below), but also by the 
fact that a number of textbooks and monographs on natural language semantics 
and its inferential and algorithmic aspects are in preparation ECU, and by a 
recent initiative to set up a special interest group on Computational Semantics 
(see http://www.coli.uni-sb.de/~patrick/SIGICS.html for details). 

In this note we survey some of the ongoing work on inference and natural 
language semantics; we identify commonalities, as well as possibilities and the 
main logical challenges we are confronted with in the field. 



2 Putting Semantics to Work 

2.1 Lines of Attack 

It has often been claimed that classical reasoning based on first-order logic (FOL) 
is not appropriate as an inference method for natural language semantics. We 
are pragmatic in this matter: try to stick to existing formats and tools and see
how far they get you; only if they fail should one develop novel formats and
tools. Traditional inference tools (such as theorem provers and model builders) 
are reaching new levels of sophistication, and they are now widely and easily 
available. Blackburn and Bos [Zj show that the ‘conservative’ strategy of using 
first-order tools can actually achieve a lot. In particular, they use first-order 
theorem proving techniques for implementing van der Sandt’s approach to pre- 
supposition. We refer the reader to the Doris system, which is accessible on the 
internet at http://www.coli.uni-sb.de/~bos/atp/doris.html. 

Although one may want to stick to first-order based tools as much as possible, 
for reasons of efficiency, or simply to get ‘natural representations’, it may pay to
move away from the traditional first-order realm. Such a move may be particu- 
larly appropriate in two of the areas that currently pose the biggest challenges 
for computational semantics: ambiguity and dynamics d Chapter 8]. Let us 
consider some samples of deductive approaches in each of these two areas. 



Reasoning with Quantifier Ambiguity. While the problem of ambiguity 
and underspecification has recently enjoyed a considerable increase in attention 
from computational linguists and computer scientists, the focus has mostly been 
on semantic aspects, and ‘reasoning with ambiguous sentences is still in its in- 
fancy ’ [E|. Lexical ambiguities can be represented pretty straightforwardly by 
putting the different readings into a disjunction. It is also possible to express 
quantificational ambiguities by a disjunction, but quite often this involves much 
more structure than in the case of lexical ambiguities, because quantificational 
ambiguities are not tied to a particular atomic expression. For instance, the only
way to represent the ambiguity of (1) in a disjunctive manner is (2).
(1) Every man loves a woman.

(2) ∀x (man(x) → ∃y (woman(y) ∧ love(x, y)))
    ∨ ∃y (woman(y) ∧ ∀x (man(x) → love(x, y)))

Obviously, there seems to be some redundancy, because some subparts appear 
twice. Underspecified approaches such as the Core language Engine (CLE, 0) 
or Underspecified Discourse Representation Theory (UDRT, m allow us to 
represent quantifier ambiguities in a non-redundant way. The corresponding
underspecified representation for (1) is given in (3).

(3)
                     h0

    l1: ∀x(man(x) → h1)      l2: ∃y(woman(y) ∧ h2)

                l3: love(x, y)



This concise representation of the possible readings should allow us to avoid 
the state explosion problem. For representing the semantics of a natural language 
sentence this can be seen immediately, but to which extent theorem proving prof- 
its from underspecified representations is not easily determined. Up to now there 
is no proof theory which can directly work with underspecified representations. All
of the approaches we are aware of rely, to some extent, on disambiguation. That 
is: first disambiguate an underspecified representation and then apply the rules 
of your proof theory. Once disambiguation has been carried out, this amounts, 
more or less, to classical proof theory, see, for instance, H3 

In |2n| we have proposed a tableau calculus that interleaves disambiguation 
steps with deduction steps so that the advantages of an underspecified represen- 
tation can, at least partially, be retained. 

In addition, it is sometimes not necessary to compute all disambiguations, 
because there exists a strongest (or weakest) disambiguation. If there exists such 
a strongest (or weakest) disambiguation it suffices to verify (or falsify) this one, 
because it entails (or is entailed by) all other disambiguations. E.g., (4) has
six readings, which are listed in (5). For each reading we put the order of the
quantifiers and negation sign as a shorthand in front of it. 



(4) Every boy didn’t see a movie 

(5) (∀∃¬) ∀x(boy(x) → ∃y(movie(y) ∧ ¬see(x,y)))
    (∀¬∃) ∀x(boy(x) → ¬∃y(movie(y) ∧ see(x,y)))
    (¬∀∃) ¬∀x(boy(x) → ∃y(movie(y) ∧ see(x,y)))
    (∃∀¬) ∃y(movie(y) ∧ ∀x(boy(x) → ¬see(x,y)))
    (∃¬∀) ∃y(movie(y) ∧ ¬∀x(boy(x) → see(x,y)))
    (¬∃∀) ¬∃y(movie(y) ∧ ∀x(boy(x) → see(x,y)))
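
The six readings of (4) are exactly the orderings of its three scope-bearing elements (the universal quantifier, the existential quantifier, and negation). The following is a minimal sketch of that enumeration. It uses a hypothetical ASCII rendering of the formulas (`all`, `some`, `not`, `->`, `&`) and hypothetical names (`OPERATORS`, `readings`), and it is not one of the underspecification formalisms cited above.

```python
from itertools import permutations

# Each scope-bearing element wraps a body formula (assumed ASCII syntax).
OPERATORS = {
    "A": lambda body: f"all x.(boy(x) -> {body})",     # universal quantifier
    "E": lambda body: f"some y.(movie(y) & {body})",   # existential quantifier
    "N": lambda body: f"not {body}",                   # negation
}

def readings(nucleus="see(x,y)"):
    """Return every scoping of the operators around the nucleus."""
    result = {}
    for order in permutations(OPERATORS):
        formula = nucleus
        # The last element of the ordering has narrowest scope,
        # so it is wrapped around the nucleus first.
        for op in reversed(order):
            formula = OPERATORS[op](formula)
        result["".join(order)] = formula
    return result

if __name__ == "__main__":
    for name, formula in readings().items():
        print(name, formula)
```

With n scope-bearing elements this yields n! candidate readings, which illustrates why enumerating all disambiguations before deduction quickly becomes expensive.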



^ Actually, the underspecified representation in (3) differs slightly from the way un- 
derspecified representations are defined in m where the holes are not explicitly 
mentioned. Our representation is a bit closer to |^, but the differences between the 
frameworks are mainly notational. 
In (6), we give the corresponding entailment graph, which has as its elements
the readings in (5). Two readings φ and ψ are connected by an arrow from φ to ψ if φ ⊨ ψ.



(6)   (∃∀¬) → (∀∃¬)      (¬∀∃) → (¬∃∀)      (∀¬∃)      (∃¬∀)



Unfortunately, this graph is not very dense. There are only two pairs of 
readings that stand in the entailment relation. Nevertheless, it allows for some 
improvement of the calculus, as it allows us to filter out some readings. (∃∀¬)
and (∀∃¬) are two of the readings of (4), where (∃∀¬) entails (∀∃¬). On the
other hand, if we are able to derive a contradiction for (∀∃¬), then we know that
(∃∀¬) is contradictory, too. In lEO] we have shown how the subset of readings
which is sufficient can be identified. 
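
The sparseness of the entailment relation among the six readings can be explored by brute force. The sketch below is not the authors' calculus. It checks every interpretation of `boy`, `movie` and `see` over domains of at most two elements and reports the ordered pairs of readings for which no small countermodel exists. Such a bounded check can refute an entailment but cannot prove one. The reading names (`AEN` for ∀∃¬, and so on) and all function names are hypothetical.

```python
from itertools import product

def models(max_size=2):
    """Yield (boys, movies, see) for every interpretation over small domains."""
    for n in range(1, max_size + 1):
        dom = range(n)
        pairs = [(x, y) for x in dom for y in dom]
        for b_bits in product([False, True], repeat=n):
            for m_bits in product([False, True], repeat=n):
                for s_bits in product([False, True], repeat=len(pairs)):
                    boys = [x for x in dom if b_bits[x]]
                    movies = [y for y in dom if m_bits[y]]
                    see = {p for p, bit in zip(pairs, s_bits) if bit}
                    yield boys, movies, see

# The six readings of "Every boy didn't see a movie" (A = ∀, E = ∃, N = ¬).
READINGS = {
    "AEN": lambda B, M, S: all(any((x, y) not in S for y in M) for x in B),
    "ANE": lambda B, M, S: all(not any((x, y) in S for y in M) for x in B),
    "NAE": lambda B, M, S: not all(any((x, y) in S for y in M) for x in B),
    "EAN": lambda B, M, S: any(all((x, y) not in S for x in B) for y in M),
    "ENA": lambda B, M, S: any(not all((x, y) in S for x in B) for y in M),
    "NEA": lambda B, M, S: not any(all((x, y) in S for x in B) for y in M),
}

def bounded_entailments(max_size=2):
    """Pairs (p, q) such that no model up to max_size makes p true and q false."""
    all_models = list(models(max_size))
    return {
        (p, q)
        for p in READINGS
        for q in READINGS
        if p != q
        and all(READINGS[q](*m) for m in all_models if READINGS[p](*m))
    }

if __name__ == "__main__":
    print(sorted(bounded_entailments()))
```

On this example the check leaves exactly two pairs, (∃∀¬, ∀∃¬) and (¬∀∃, ¬∃∀). Both are instances of the valid shift from ∃∀ to ∀∃, so here the bounded result coincides with genuine entailment.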

But there is more to reasoning with quantificational ambiguity than just de- 
veloping a calculus for it. In the presence of multiple readings of premises and 
conclusions, fundamental logical notions such as entailment receive new dimen- 
sions. Should all possible readings of the conclusion follow (in the traditional 
sense) from all possible readings of the premises for the ambiguous conclusion 
to qualify as a consequence of an ambiguous premise? Basic research in this 
direction has been carried out by a number of people j1 ‘/’ittifi TlV’Oj . Ultimately, 
the aim here is to obtain insights into the development and implementation of 
theorem provers for underspecified representations. 



Reasoning with Pronoun Ambiguity. A number of calculi have been pro- 
posed for reasoning with dynamic semantics. 1 i'-i7l4lH present natural deduction 
style calculi for Discourse Representation Theory, and presents a tableau 
calculus. In the area of Dynamic Predicate Logic (DPL, [in|) and its many vari- 
ations, ini presents some ground-tableau calculi and uni a sequent calculus. 
All of these approaches presuppose that pronouns are already resolved to some 
antecedent. Therefore, the problem of pronoun ambiguity does not arise within 
the calculus but in the construction algorithm of the semantic representations. In
order to employ the aforementioned calculi it is necessary that the semantic rep- 
resentation is disambiguated, but this might result in a huge number of readings, 
where the advantage of underspecified representation is lost. Again, it seems rea- 
sonable to interleave disambiguation and deduction steps, where disambiguation 
is only carried out if this is demanded by the deduction method. 

The resolution method EH! has become quite popular in automated theo- 
rem proving, because it is very efficient and it is easily augmentable by lots of 
strategies which restrict the search space, see e.g., 123 ]. On the other hand, the 
resolution method has the disadvantage of requiring its input to be in clause
form, where clause form is the same as conjunctive normal form (CNF) but a disjunction is
displayed as a set of literals (the clause) and the conjunction of disjunctions is 
a set of clauses. Probably the most attractive feature of resolution is that it has 
only one single inference rule, the resolution rule. 
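
To make the clause-form view concrete, here is a minimal sketch of propositional resolution over sets of clauses. It is a textbook saturation loop rather than any of the calculi cited in this section. The literal encoding (a string, with a leading `-` for negation) and the function names are assumptions made for the illustration.

```python
def complement(lit):
    """Return the complementary literal: p <-> -p."""
    return lit[1:] if lit.startswith("-") else "-" + lit

def resolvents(c1, c2):
    """All clauses obtained by resolving c1 and c2 on one complementary pair."""
    return {
        frozenset((c1 - {lit}) | (c2 - {complement(lit)}))
        for lit in c1
        if complement(lit) in c2
    }

def refutable(clauses):
    """Saturate under the resolution rule; True if the empty clause is derived."""
    known = {frozenset(c) for c in clauses}
    while True:
        new = set()
        for a in known:
            for b in known:
                for r in resolvents(a, b):
                    if not r:          # empty clause: contradiction found
                        return True
                    new.add(r)
        if new <= known:               # nothing new: the clause set is saturated
            return False
        known |= new

if __name__ == "__main__":
    # (p or q), not p, not q is unsatisfiable.
    print(refutable([{"p", "q"}, {"-p"}, {"-q"}]))
```

The clause set is flat: once a formula has been converted, nothing records which quantifier or connective a literal originally sat under.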
Applying the classical resolution method to a dynamic semantics introduces
a problem: transforming formulas to clause form causes a loss of structural infor- 
mation. Therefore, it is sometimes impossible to distinguish between variables 
that can serve as antecedents for a pronoun and variables that cannot. [27,28]
provide a resolution calculus that uses labels to encode the information about 
accessible variables. Each pronoun is annotated with a label that indicates the 
set of accessible antecedents. 

A further problem with the resolution calculus as presented in
[27,28] is that it requires backtracking in order to be complete. Unfortunately,
backtracking is hard to implement efficiently and it spoils some of the appeal of 
preferring resolution over tableau methods. 

A tableau calculus for pronoun ambiguity has been introduced in |3D|. This 
tableau calculus has a number of advantages over a resolution-based approach to 
pronoun resolution, as mentioned above. First of all, it is possible to interleave 
the computation of accessible variables with deduction, since preservation of 
structure is guaranteed in our signed tableau method. This is not possible in 
resolution, because it is assumed that the input is in conjunctive normal form, 
which destroys all structural information needed for pronoun binding. There, 
accessible antecedents can only be computed by a preprocessing step, cf. I'iTf'iMj . 

But the major advantage is that no backtracking is needed if the choice of 
an antecedent for a pronoun does not allow us to close all open branches; we 
simply apply pronoun resolution again, choosing a different antecedent. 



2.2 Lessons Learned 

The brief sketches of recent work on inference and natural language semantics 
given above show a number of things. First, all traditional computational rea- 
soning tasks (theorem proving, model checking, model generation) are needed, 
but often in novel settings that work on more complex data structures. Dealing 
with ambiguity is one of the most difficult tasks for theorem provers, and we 
have seen in the previous section how we can tackle this problem. On the other 
hand, so far we have only looked at theorem proving for quantifier and pronoun 
ambiguity, separately; but what kind of problems arise if one tries to devise a 
theorem prover for a language containing both kinds of ambiguity? We will have 
a closer look at this later on. 

Second, there are novel logical concerns both at a fundamental and at an 
architectural level. The former is illustrated by the proliferation of notions of en- 
tailment and by the need for incremental, structure preserving proof procedures. 
As to the latter, to move forward we need to develop methods for integrating 
specialized inference engines, possibly operating on different kinds of informa- 
tion, with other computational tools such as statistical packages, parsers, and 
various interfaces. We propose to use combinations of small specialized modules 
rather than large baroque systems. Of course, similar strategies in design and 
architecture have gained considerable attention in both computer science |H|, 
and in other areas of applied logic and automated reasoning m- 
Combining Ambiguities. What happens if we combine both kinds of ambiguity
and try to reason efficiently with formulas that contain ambiguous quantifier
scopings and unresolved pronouns? Especially, which of the proof strategies that 
we used for dealing with the respective ambiguities can be adopted, and which 
ones raise problems? 

When combining different kinds of ambiguity, ambiguities do not simply multiply,
but they also interfere. Below, a short example is given, where (7) and its
two readings in (8), an instance of quantifier ambiguity, are followed by (9), which
contains an unresolved pronoun.

(7) Every man loves a woman.

(8) a. ∀x (man(x) → ∃y (woman(y) ∧ love(x, y)))
    b. ∃y (woman(y) ∧ ∀x (man(x) → love(x, y)))

(9) But she is already married. 

Here, (9) allows us to resolve the quantifier ambiguity of (7). Therefore, an
appropriate calculus has to account for this. (9) filters out (8a), because (8a) does
not provide an antecedent for the pronoun she in (9). This is easily seen, as (7)
was uttered in the empty context, and (8a) does not provide any antecedents.
This implies that (8a) cannot be a possible reading.

The preceding discussion hints at another problem that occurs if we
try to reason in a combined framework. Considering only quantifier ambiguity,
it was possible to neglect a reading φ if it entailed another reading ψ. Is this still
possible if there are pronouns occurring in the proof which remain to be resolved?
Reconsidering (7), the reading (8b) entails (8a), and it is sufficient to use only
(8a) in the proof. But if (7) is followed by (9), then (8a) does not provide any
antecedents for the pronoun in (9) and the pronoun remains unresolved. In fact,
according to the discussion above, (8a) would be filtered out, just because it
cannot provide any antecedent; but then, no reading is left. (8b) is filtered out,
because it is stronger than (8a), and (8a) is filtered out for the reasons just
given.

An obvious way out is to prefer weaker readings over stronger ones without 
throwing the stronger reading away. Only if the weaker reading leaves no
pronoun unresolved can one fully dispense with the stronger reading.
For a longer discussion of this problem and some ways to solve this, the reader 
is referred to m 
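
One way to picture the "prefer weaker, but keep stronger" strategy is as a two-stage filter. This is a rough sketch under simplifying assumptions and not the procedure of the work referred to above. Each reading is paired with the set of antecedents it makes available, entailment is given as a set of (stronger, weaker) pairs, and all names are hypothetical.

```python
def select_readings(antecedents, entails, pronoun_pending):
    """
    antecedents:     dict mapping a reading name to the set of referents it
                     makes accessible to later pronouns (assumed encoding).
    entails:         set of (stronger, weaker) pairs between readings.
    pronoun_pending: True if the continuation contains an unresolved pronoun.
    """
    # Stage 1: keep only readings that can still resolve a pending pronoun.
    viable = [r for r in antecedents
              if antecedents[r] or not pronoun_pending]
    # Stage 2: drop a reading only if it entails a different reading
    # that is itself still viable.
    return [r for r in viable
            if not any((r, w) in entails for w in viable if w != r)]

if __name__ == "__main__":
    # Readings (8a) and (8b) of "Every man loves a woman".
    antecedents = {"8a": set(), "8b": {"y"}}
    entails = {("8b", "8a")}
    print(select_readings(antecedents, entails, pronoun_pending=False))
    print(select_readings(antecedents, entails, pronoun_pending=True))
```

Because the pronoun check runs before the entailment filter, the stronger reading (8b) survives whenever the weaker reading (8a) cannot supply an antecedent, so the set of readings never becomes empty for this example.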

Incrementality. Implementations in computational semantics that employ the- 
orem provers normally state the inference tasks in a non-incremental way. For 
instance, Doris filters out those readings of a natural language discourse that 
do not obey local informativity or local consistency constraints. In this process 
of filtering out readings, the system is often faced with very similar reasoning 
tasks involving very similar sets of premises and conclusions. In Doris, these 
tasks are treated independently of each other, and every inference task is started 
from scratch. The set of formulas which are treated multiple times grows with 
the length of the discourse. Of course, this redundancy significantly decreases 
the efficiency of the implementation, and it will prevent the system from scaling
up.

introduce a way of stating these inference tasks such that redundant 
applications of inference rules can be avoided. This is accomplished by taking 
context and the way contextual information is threaded through a discourse 
explicitly into account. The approach in mm is based on formal theories of 
context, see, e.g., Bim . 

3 Further Directions and Challenges 



The findings of the previous sections are supported by a number of further and 
novel developments in more applied areas adjacent to natural language seman- 
tics. We will restrict ourselves to three examples. 

First, in syntactic analysis, partial or underspecified approaches to parsing 
are becoming increasingly popular. Just like underspecified representations
in semantics, a partial parse fully processes certain phrases, but leaves some am- 
biguities such as modifier attachment underspecified. Given this similarity, it is 
natural to ask whether underspecified semantics can somehow be combined with 
partial parsing. An ongoing project at ILLC studies to which extent one can, for
instance, take semantic information into account to resolve syntactic ambiguities;
see http://www.illc.uva.nl/~mdr/Projects/Derive/ for details. Note that 
combinations of underspecified representation and packed syntactic trees (parse 
forests) have been considered before Ena, but no methods for using semantic 
information to resolve syntactic ambiguities are reported there. 

Second, assuming that underspecified representations can usefully be com- 
bined with partial parsing, we may be able to improve methods in Information 
Extraction (IE). Common approaches to IE suffer from the fact that they ei- 
ther give only a very shallow analysis of text documents, as in approaches using 
word vectors, or that they are domain dependent, as in the case of template 
filling. More general techniques using some kind of logical representation could 
circumvent these disadvantages. Now, IE techniques provide the right data struc- 
tures, but to access the information one needs the right retrieval algorithms. 
Logic-based Information Retrieval (IR) has been around, at least theoretically, 
since the mid 1980’s |3H|. An ongoing project at ILLC investigates to which 
extent underspecified reasoning and representation can be used for IR; again, 
see http://www.illc.uva.nl/~mdr/Projects/Derive/ for details. We do not 
believe that these techniques can compete with IR methods for very large data 
collections, where logic-based techniques seem to be intractable, but we are con- 
fident about substantial quality improvements for smaller domains. 

In this context, it seems interesting to investigate to which extent Description 
Logics can be employed to represent the content of a document. m consider 
a fragment of Montague Semantics (EHI) that can be expressed in Description 
Logics. Formulas belonging to this fragment have to be quantifier-free, meaning 
that they do not contain any lambda abstractions. For instance, (10b), which
is the semantic representation of (10a), belongs to the fragment, but (11b),
representing (11a), does not.

(10) a. Mary read a book. 

b. ( Mary read ( some book )) 

(11) a. Mary read a book that John bought. 

b. ( Mary read ( some ( λx ( x book ∧ ( John ( bought x ))))))

For this quantifier- free fragment, m provides an inference procedure which 
decides satisfiability in polynomial time. More generally, Description Logics are
concerned with representations and inference algorithms for fragments of first- and
higher-order logics in which quantification is of a restricted, or guarded nature; 
see, for instance,  for further uses of Description Logics in computational
semantics. One of the important advantages of using Description Logics is that
very efficient inference tools are available, such as DLP M- 

Finally, and coming from a completely different direction, there is work on 
the use of dynamic semantics to explain the meaning of programs in hybrid 
programming languages such as Alma-0 |2| that combine the imperative and 
declarative programming paradigms. PI shows how dynamic predicate logic 
provides an adequate semantics for a non-trivial fragment of Alma-0, and how 
inference tools for dynamic predicate logic become verification tools for the hy- 
brid programming language. 

4 Conclusions 

In this note we have identified some of the main concerns of doing inference 
for natural language semantics. One of the most difficult tasks in this context 
is the problem of reasoning with ambiguity. We have seen that it is possible 
to devise calculi which can deal with a particular kind of ambiguity, but that 
things get much more complicated if one tries to devise a calculus which can deal 
with different kinds of ambiguity. We have illustrated these concerns by means 
of samples from ongoing research initiatives, and, in addition, we have listed 
what we take to be some of the main challenges and most promising research 
directions in the area. 
Abstract. The need to solve structural constraints arises when we
investigate computational solutions to the question: “in which logics is a
given formula deducible?” This question is posed when one wants to learn 
the structural permissions for a Categorial Grammar deduction. 

This paper is part of a project started at cni. to deal with the question 
above. Here, we focus on structural constraints, a form of Structurally- 
Free Theorem Proving, that deal with an unknown transformation X 
which, when applied to a given set of components P1 ... Pn, generates a
desired structure Q. The constraint is treated in the framework of the
combinator calculus as XP1 ... Pn ↠ Q, where the transformation X is a
combinator, the components Pi and Q are terms, and ↠ reads “reduces
to”.

We show that in the usual combinator system not all admissible con- 
straints have a solution; in particular, we show that a structural con- 
straint that represents right-associativity cannot be solved in it nor in 
any consistent extension of it. To solve this problem, we introduce the 
notion of a restricted combinator system, which can be consistently ex- 
tended with complex combinators to represent right-associativity. Finally, 
we show that solutions for admissible structural constraints always exist 
and can be efficiently computed in such an extension.



1 Introduction 

This paper introduces the concept of structural constraints and studies how 
to efficiently solve them. The problem of solving structural constraints arises 
when one tries to automatically “learn” the structural permissions that can be 
used in inferences in Categorial Grammar. Lest we raise false expectations, we 
hasten to point out that structural constraints are just one step in the learning of 
structural permissions, motivated by our initial work in structurally-free theorem 
proving Pi- 

Linguistic inference in Categorial Grammar is normally performed in one of 
a variety of substructural logics. E.g. in HU the repertoire included four logics 
(L, NL, NLP and LP) parameterized by whether the structural rules of (right-) 

* Partly supported by the Brazilian Research Council (CNPq), grant PQ 300597/95-5. 
associativity and commutativity were allowed or not. We want here to address
the problem of automatically learning from an example which structural rules 
(and hence which logic) we are dealing with. 

More specifically, we address the following question: given an antecedent and 
a desirable consequent, can we automatically determine in which substructural 
logics the latter is inferable from the former? This activity has been termed 
Structurally-Free Theorem Proving (SFTP).

SFTP is a variation of traditional theorem proving introduced in COl, which 
emerges from the work on Structurally Free Logics (SFL) by Dunn and Meyer jOJ 
|7| . SFL is a family of logics free from any structural presupposition — whence its 
name. SFL includes combinators (i.e. λ-terms without free variables) as a special
kind of atom. Combinators represent faithfully the usual structural inference
rules (contraction, commutativity, etc.). In SFL, these rules are replaced by
combinator rules such that each SFL-deducible sequent contains some combinators
that indicate the structural rules needed for its deduction. SFL is thus strongly 
related to the family of propositional substructural logics |S|; in such a family, one 
logic differs from the others by the structural rules it allows in the deductions, 
i.e. by the rules through which the premises in a proof can be rearranged. 
Possible applications of SFTP to computational linguistics include: 

— Automated learning of structural permissions in the framework of Catego- 
rial Grammar. In particular, SFTP allows one to compute, from a positive 
example, the structural rules needed to make this example true, and where 
they should be applied. 

— Bridging the gap between the deductive approach to Categorial Grammar 
of  and Steedman’s Combinatory Categorial Grammar, taking
advantage of the presence of combinators as first class elements in SFTP.

Admittedly, SFTP techniques have to be perfected and refined before these 
goals can be fully achieved. In m it was shown that SFTP can be efficiently 
solved in a fragment containing connectives {•, /} if it can be solved for the 
{•}-fragment; the solution for the {•}-fragment was only hinted at. 

This paper intends to provide a solid basis for SFTP by showing that it can 
be efficiently solved in the {•}-fragment, thus complementing the work of em- 
It turns out that solving SFTP in the {•}-fragment is equivalent to solving a
structural constraint, that is, an expression in the λ-calculus and combinator
calculus P of the form

XP1 ... Pn ↠ Q

where X is the unknown structural transformation, P1 ... Pn and Q are pure
terms, i.e. terms built only with variables, and ↠ is β- or combinator-reduction.
A structural constraint is admissible if each variable in Q occurs in some Pi. 
For example, to know the structural rules needed to deduce 

p1 • p2 ⊢ p2 • (p1 • p1)

it suffices to find a combinator that solves the structural constraint

X x1 x2 ↠ x2(x1 x1).
The solution for this constraint, X = W(BC(C(B))) as computed in Section I01
represents the structural rules for commutativity (C), left-associativity (B) and 
contraction (W). Moreover, it is possible from this solution to reconstruct a 
deduction for the initial sequent mu- 
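
A claimed solution of a structural constraint can be checked mechanically by reducing X x1 x2 to normal form. The sketch below is a small leftmost-outermost reducer written for illustration only and is not the algorithm of this paper. It assumes the standard reduction rules for I, K, W, B, C and S, reads the solution above as W(B C (C B)), and uses a hypothetical term encoding in which a string is an atom and a pair `(f, a)` is an application.

```python
# Standard combinator rules (assumed): arity and right-hand side builder.
RULES = {
    "I": (1, lambda x: x),
    "K": (2, lambda x, y: x),
    "W": (2, lambda x, y: ((x, y), y)),          # W x y   -> x y y
    "B": (3, lambda x, y, z: (x, (y, z))),       # B x y z -> x (y z)
    "C": (3, lambda x, y, z: ((x, z), y)),       # C x y z -> x z y
    "S": (3, lambda x, y, z: ((x, z), (y, z))),  # S x y z -> x z (y z)
}

def app(*terms):
    """Left-associated application: app(a, b, c) is ((a b) c)."""
    t = terms[0]
    for u in terms[1:]:
        t = (t, u)
    return t

def spine(t):
    """Split a term into its head atom and the list of its arguments."""
    args = []
    while isinstance(t, tuple):
        t, a = t
        args.append(a)
    return t, args[::-1]

def step(t):
    """One leftmost-outermost reduction step, or None if t is normal."""
    head, args = spine(t)
    if head in RULES:
        arity, rule = RULES[head]
        if len(args) >= arity:
            return app(rule(*args[:arity]), *args[arity:])
    # No head redex: look for a redex inside the arguments, left to right.
    for i, a in enumerate(args):
        r = step(a)
        if r is not None:
            return app(head, *args[:i], r, *args[i + 1:])
    return None

def normalize(t, limit=1000):
    """Reduce t until no redex remains (gives up after `limit` steps)."""
    for _ in range(limit):
        r = step(t)
        if r is None:
            return t
        t = r
    raise RuntimeError("no normal form found within the step limit")

if __name__ == "__main__":
    X = app("W", app("B", "C", app("C", "B")))
    print(normalize(app(X, "x1", "x2")))   # ('x2', ('x1', 'x1'))
```

The reduction uses W once, B twice and C twice, matching the reading of the solution in terms of contraction, associativity and commutativity.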
It turns out, however, that the path to finding solutions to any admissible
structural constraint is not a simple one. Some admissible structural constraints
do not have a solution in the combinator calculus (or, equivalently, in the
λ-calculus), e.g.



X x2 (x1 x1) ↠ x1 x2.

If we simply extend the combinator calculus to cope with such deficiency, it 
easily becomes inconsistent (Lemma Ej). 

The main result of the paper is showing that we can construct a consistent 
setting in which all admissible structural constraints have efficiently computable 
solutions, so SFTP can be solved in the •-fragment. 

By overcoming the inconsistency problems (see below), this work paves the 
way to the extension of SFTP techniques of m!i to the more usual
{•, /, \}-fragment of Categorial Grammar.

We proceed as follows. After introducing the combinator calculus and SFL
in Section 2, we formally define structural constraints in Section 3 and show
that the class of simple constraints has simple, efficient computational solutions.
We then present an algorithm that solves simple structural
constraints.

However, Section 4 shows that the full class of complex structural constraints
does not have a solution in the standard combinator system nor in any extension
of it. To solve this problem, we introduce in Section 4.1 the notion of a restricted
combinator system and show that the problems of standard combinators are absent from
it. Finally, we introduce in Section 4.2 complex combinators and show how all admissible
complex structural constraints have a solution using complex combinators; in Section 4.3
this solution is shown to be computable in time 0{N'^). 

2 Background 

2.1 Combinator Systems 

Given a set C of basic combinators, a combinator system is a pair
$\langle C^*_C, \twoheadrightarrow_C \rangle$, where the set C is the basis of
the system, $C^*_C$ are the combinator terms based on C, and
$\twoheadrightarrow_C$ is the reduction relation associated to C. We usually
refer to C as the combinator system. When the basis C is clear from the context,
the system is represented just by $\langle C^*, \twoheadrightarrow \rangle$; in
this section, we use by default the set of primitive combinators
$C_0 = \{K, S, W, C, B, I\}$ described below.

To define the set of terms, consider $V = \{x, y, z, \ldots\}$ a countable set of
variables; both variables and combinators are atomic terms. The set $C^*$ of
combinator terms is the smallest set that includes all atomic terms and such
that, if P and Q are terms, then the application (PQ) is also a term.

A term P is pure if it contains only variables. A (compound) combinator is
a term that contains no variables. Var(Q) represents the set of free variables
occurring in the term Q. We abbreviate $\bar{x} = x_1 \ldots x_n$;
$\bar{P} = P_1 \ldots P_n$ and

$Q\bar{P} = Q P_1 \ldots P_n = ((\ldots((Q P_1) P_2) \ldots) P_n)$,

that is, application associates to the left. $\bar{x}$ also represents the set
$\{x_1, \ldots, x_n\}$; context always separates both uses. $P(\bar{x})$
represents a term P where $Var(P) = \bar{x}$.

In $\bar{P} = P_1 \ldots P_n$ we always assume that $P_1$ is an atomic term; at
the end of Section 4.4 we show how this restriction can be overcome.

Each combinator $X \in C$ is associated to a basic reduction rule of the form
$X\bar{P} \rhd Q$ defining a binary relation between terms, where $X\bar{P}$ is
called a redex (reducible expression). Each $X \in C$ is assumed to be
functional, i.e. if $\bar{P}$ is pure, there is at most one Q such that
$X\bar{P} \rhd Q$. Figure 1 presents the basic reduction rules for the
combinators in $C_0 = \{K, S, W, C, B, I\}$; the choice of letters is historical.



$BPQR \rhd P(QR)$      $CPQR \rhd PRQ$        $IP \rhd P$

$WPQ \rhd PQQ$         $SPQR \rhd PR(QR)$     $KPQ \rhd P$

Fig. 1. Reduction Rules for Primitive Combinators



For a given set C the reduction relation $\twoheadrightarrow$ is the smallest
binary relation containing $\rhd$ and closed under:

- reflexivity: $P \twoheadrightarrow P$;

- transitivity: $P \twoheadrightarrow Q$ and $Q \twoheadrightarrow R$ implies
  $P \twoheadrightarrow R$;

- congruence: if $P \twoheadrightarrow P'$ then $PQ \twoheadrightarrow P'Q$ and
  $QP \twoheadrightarrow QP'$.

The number of redexes in a term P is represented by NRedex(P). P is in normal
form (nf) if NRedex(P) = 0.
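The reduction relation can be made concrete with a small normal-order reducer.
The following Python sketch is an illustration only and is not part of the
original formulation: the encoding of an application (PQ) as the tuple (P, Q)
and the helper names `app`, `spine`, `step` and `normalize` are assumptions
introduced here. It applies the basic reduction rules of Fig. 1, always
contracting the leftmost-outermost redex.

```python
from functools import reduce as fold

def app(*terms):
    """Left-associated application: app(P, Q, R) == ((P Q) R)."""
    return fold(lambda f, a: (f, a), terms)

def spine(t):
    """Split a term into its head atom and the list of its arguments."""
    args = []
    while isinstance(t, tuple):
        t, a = t
        args.append(a)
    return t, args[::-1]

# Basic reduction rules of Fig. 1: name -> (arity, contractum builder).
RULES = {
    "B": (3, lambda p, q, r: app(p, app(q, r))),
    "C": (3, lambda p, q, r: app(p, r, q)),
    "I": (1, lambda p: p),
    "W": (2, lambda p, q: app(p, q, q)),
    "S": (3, lambda p, q, r: app(p, r, app(q, r))),
    "K": (2, lambda p, q: p),
}

def step(t):
    """Contract the leftmost-outermost redex; return None if t is in normal form."""
    head, args = spine(t)
    if head in RULES and len(args) >= RULES[head][0]:
        n, rule = RULES[head]
        return app(rule(*args[:n]), *args[n:])
    for i, a in enumerate(args):
        r = step(a)
        if r is not None:
            return app(head, *args[:i], r, *args[i + 1:])
    return None

def normalize(t, limit=1000):
    """Reduce t to normal form, giving up after `limit` steps (terms may diverge)."""
    for _ in range(limit):
        nxt = step(t)
        if nxt is None:
            return t
        t = nxt
    raise RuntimeError("no normal form found within the step limit")

# The solution quoted in the introduction: W(BC(CB)) x y reduces to y(xx).
x_solution = app("W", app("B", "C", app("C", "B")))
print(normalize(app(x_solution, "x", "y")))   # ('y', ('x', 'x'))
```

Variables are lower-case strings and primitive combinators are the upper-case
keys of `RULES`; the step limit is only a guard against divergent terms such as
WWW.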

The (weak) equality on terms ($=_{w,C}$, or only $=_w$ when C is clear) is a
binary relation obtained by the reflexive, symmetric, transitive and congruent
closure of the basic reduction rules. A combinator system is inconsistent if
$P =_w Q$ for all $P, Q \in C^*$; otherwise it is consistent. A system is CR
(i.e. it has the Church-Rosser property) if whenever $P \twoheadrightarrow Q$
and $P \twoheadrightarrow Q'$ there exists R such that $Q \twoheadrightarrow R$
and $Q' \twoheadrightarrow R$. It is well known that the $C_0$-system is CR and
consistent.

Combinators can be defined as terms in the $\lambda$-calculus without free
variables, as shown in Figure 2. Also, any $\lambda$-term has an equivalent
combinator term built up using only the combinators S and K. In this work, we
will treat combinators as primitive entities, not as derived $\lambda$-terms, as
when they were originally proposed.

$B = \lambda xyz.x(yz)$     $C = \lambda xyz.xzy$        $I = \lambda x.x$

$W = \lambda xy.xyy$        $S = \lambda xyz.xz(yz)$     $K = \lambda xy.x$

Fig. 2. $\lambda$-calculus combinators

A combinator X is definable in a basis C if there exists an $X' \in C^*_C$ such
that for all pure terms $\bar{P}$, Q, $X\bar{P} \twoheadrightarrow Q$ iff
$X'\bar{P} \twoheadrightarrow Q$. A set of combinators is independent if none of
the combinators is definable in terms of the others. The primitive combinators
in Figure 1 are not independent from each other. All combinators are definable
in terms of S and K. Also, S is definable in terms of W, B and C as
B(BW)(BC(BB)). K is independent from all others.

So, among the primitive combinators there are two independent, combinatorially
complete sets, namely {K, S} and {K, W, C, B}; that is, all combinators can
be defined in terms of them. The combinator I is definable in both sets (as SKK
and WK) but is very useful and is normally added to the bases; note that if we
drop the combinator K, {W, C, B, I} is also independent, though not complete.
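The definability of S from W, B and C can be checked by unfolding the basic
reduction rules of Fig. 1 one at a time. The derivation below is a worked
illustration added for the reader, contracting the head redex at each step:

```latex
\begin{align*}
B(BW)(BC(BB))\,x\,y\,z
  &\twoheadrightarrow BW(BC(BB)x)\,y\,z   && \text{(B)} \\
  &\twoheadrightarrow W(BC(BB)x\,y)\,z    && \text{(B)} \\
  &\twoheadrightarrow BC(BB)x\,y\,z\,z    && \text{(W)} \\
  &\twoheadrightarrow C(BBx)\,y\,z\,z     && \text{(B)} \\
  &\twoheadrightarrow BBx\,z\,y\,z        && \text{(C)} \\
  &\twoheadrightarrow B(xz)\,y\,z         && \text{(B)} \\
  &\twoheadrightarrow xz(yz)              && \text{(B)}
\end{align*}
```

The final term is exactly the contractum of $Sxyz$.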



2.2 Structurally Free Theorem Proving (SFTP) and Structurally 
Free Logics (SFL) 

In the family of logics SFL(C), combinators in C are treated as special
propositional atoms. Atomic formulas are propositional letters or combinators
(primitive or compound). For the purpose of this paper, we consider a
propositional fragment containing only the connective $\bullet$, known as
multiplicative conjunction, product or fusion. A formula is pure if it does not
contain a combinator.

To be able to refrain from structural presupposition, sequent deductions in
SFL have to deal with sequents of the form $\Gamma \vdash \varphi$ where
$\varphi$ is a formula and $\Gamma$ is a structure defined as:

- every formula is a structure;

- if $\Gamma$ and $\Delta$ are structures, so is $(\Gamma, \Delta)$.

Structures associate to the left, so
$\Gamma, \Delta, \Sigma = ((\Gamma, \Delta), \Sigma)$. In this setting, the
sequent rules for $\bullet$ are:

$\dfrac{\Gamma \vdash \varphi \quad \Delta \vdash \psi}{\Gamma, \Delta \vdash \varphi \bullet \psi}\ (\vdash \bullet)$
and
$\dfrac{\Gamma[\varphi, \psi] \vdash \chi}{\Gamma[\varphi \bullet \psi] \vdash \chi}\ (\bullet \vdash)$

where $\Gamma[\varphi, \psi]$ indicates that the structure $(\varphi, \psi)$
occurs in $\Gamma$ and $\Gamma[\varphi \bullet \psi]$ is obtained by
substituting $\varphi \bullet \psi$ for $(\varphi, \psi)$ in $\Gamma$. Besides
those connective rules, there is the Axiom rule stating that
$\varphi \vdash \varphi$.

SFL has combinator rules instead of structural rules. All potential uses of
structural rules in proofs are accounted for by the introduction of a
combinator. For example, the usual left-associativity rule (below on the left)
is replaced in SFL by the B-introduction rule (below on the right):

$\dfrac{\Gamma[\Theta, (\Delta, \Sigma)] \vdash \chi}{\Gamma[\Theta, \Delta, \Sigma] \vdash \chi}\ (\text{l-assoc})$
$\qquad$
$\dfrac{\Gamma[\Theta, (\Delta, \Sigma)] \vdash \chi}{\Gamma[B, \Theta, \Delta, \Sigma] \vdash \chi}\ (B \vdash)$

Note that the structure inside [ ] in the lower part of the $(B \vdash)$-rule
can be seen as a redex that reduces into that in the upper part of the rule.
Similarly, each primitive combinator is associated to a combinator rule and a
structural rule. Combinator B accounts for left-associativity, C for
commutativity, I for identity, W for contraction, K for thinning (or
monotonicity) and S is also a type of contraction, called factoring. There can
be rules associated to compound combinators too. Figure 3 in Section 4.2
illustrates several combinator rules and their associated structural rules.

Structurally-free theorem proving (SFTP) takes as input an intuitionistic
(i.e. combinator-free) sequent $\Gamma_1, \ldots, \Gamma_n \vdash \varphi$, and
tries to find a combinator structure X such that
$X, \Gamma_1, \ldots, \Gamma_n \vdash \varphi$ is deducible in SFL. The
combinators occurring in X encode the structural rules needed for the deduction
of the original sequent, and so can tell us in which substructural logics it is
deducible. In the $\bullet$-fragment, a sequent
$X, \Gamma_1, \ldots, \Gamma_n \vdash \varphi$ has an associated structural
constraint $X P_1 \ldots P_n \twoheadrightarrow Q$, where each $P_i$ and $Q$ are
obtained from $\Gamma_i$ and $\varphi$ by deleting $\bullet$ and "," (structure
composition), i.e. replacing $\bullet$ and "," by application, and considering
atomic propositions as variables.

However, not every deducible $\bullet$-sequent
$\Gamma_1, \ldots, \Gamma_n \vdash \varphi$ in intuitionistic logic has an
equivalent $X, \Gamma_1, \ldots, \Gamma_n \vdash \varphi$ in SFL($C_0$). For
example, the rule of right-associativity

$\dfrac{\Gamma[\Theta, \Delta, \Sigma] \vdash \chi}{\Gamma[\Theta, (\Delta, \Sigma)] \vdash \chi}\ (\text{r-assoc})$

has no corresponding combinator in $C_0$. Section 4 shows how a different
combinator system completely represents all intuitionistic deductions and allows
for the efficient computation of its associated combinator.

3 Structural Constraints 

Let C be a generic basis and let X be a metavariable standing for an unknown
combinator. A structural constraint on X has the format:

$\forall \bar{x}\bar{y}\,(X\,\bar{P}(\bar{x}) \twoheadrightarrow Q(\bar{y}))$    (1)

The "$\forall \bar{x}\bar{y}$" in (1) means that X must also be a solution for
any substitution of $\bar{x}\bar{y}$. Constraint (1) is admissible if
$Var(\bar{P}) = \bar{x} \supseteq \bar{y}$. It is a simple constraint if
$\bar{P} = \bar{x}$; otherwise it is a complex structural constraint.
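The two side conditions on constraint (1) are easy to check mechanically. The
Python sketch below is only an illustration under assumptions made here: pure
terms are encoded as nested pairs of variable names, and the function names
`variables`, `is_admissible` and `is_simple` are hypothetical.

```python
def variables(term):
    """Set of variables of a pure term; an application (P Q) is the tuple (P, Q)."""
    if isinstance(term, str):
        return {term}
    f, a = term
    return variables(f) | variables(a)

def is_admissible(p_bar, q):
    """Admissible: every variable of Q occurs in some P_i."""
    lhs = set().union(*(variables(p) for p in p_bar))
    return variables(q) <= lhs

def is_simple(p_bar):
    """Simple: the arguments of X are just distinct variables."""
    return all(isinstance(p, str) for p in p_bar) and len(set(p_bar)) == len(p_bar)

# X x1 x2 ->> x2 (x1 x1) is admissible and simple.
print(is_admissible(["x1", "x2"], ("x2", ("x1", "x1"))), is_simple(["x1", "x2"]))
# X x2 (x1 x1) ->> x1 x2 is admissible but complex.
print(is_admissible(["x2", ("x1", "x1")], ("x1", "x2")), is_simple(["x2", ("x1", "x1")]))
```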

Simple structural constraints have the format $X\bar{x} \twoheadrightarrow Q$
and are easily solvable.

Lemma 1. A simple structural constraint has a solution iff it is admissible.

For a proof, see Corollary 2.1.24 of the cited reference. Section 3.1 below
presents computational solutions. Note that there may be more than one solution
for (1). In particular, any X' that reduces to X will also be a solution.

We start by concentrating on the solution for simple constraints and leave
the solution for complex constraints to Section 4.

3.1 A Solution for Simple Constraints

We start by concentrating on the computation of solutions of simple structural
constraints of the form $X\bar{x} \twoheadrightarrow Q$, which is equivalent to
finding a combinator corresponding to $\lambda\bar{x}.Q$. There are many
algorithms in the literature that solve this problem:

- Curry's initial formulation used the basis {K, S, I}, but the number of
  combinators in it was exponential in the size of $\bar{x}$.

- To reduce this inefficiency, Curry et al. introduced the combinators C
  and B in the algorithm, reducing the space complexity to quadratic in
  the size of $\bar{x}$.

- More recent results by Turner, using so-called non-standard bases, were able
  to reduce space to a linear dependency. Also, the time complexity was
  reduced to $N \log N$.

However, none of these methods generates a solution that is extensible to the
solution of complex structural constraints. So we introduce a solution of our
own for the simple case and in Section 4 we will show how it generalises to the
case of complex constraints.

3.2 List Combinators 

We first introduce some abbreviations. The first is the deferred combinator,
originally presented in the cited literature. For $X \in C_0 - \{I\}$ define
$X_{(i)}$ as:

$X_{(1)} = X$,      $X_{(i+1)} = B X_{(i)}$.

Intuitively, $X_{(i)}$ defers to the right the application of X by $i - 1$
positions.^1 For example,

$C_{(1)} x_0 x_1 x_2 x_3 x_4 = C x_0 x_1 x_2 x_3 x_4 \twoheadrightarrow x_0 x_2 x_1 x_3 x_4$

$C_{(2)} x_0 x_1 x_2 x_3 x_4 = BC x_0 x_1 x_2 x_3 x_4 \twoheadrightarrow x_0 x_1 x_3 x_2 x_4$

$C_{(3)} x_0 x_1 x_2 x_3 x_4 = B(BC) x_0 x_1 x_2 x_3 x_4 \twoheadrightarrow x_0 x_1 x_2 x_4 x_3$, etc.

We can now deal with the infinite basis
$C_0^{()} = \{X_{(i)} \mid X \in C_0 - \{I\}, i \geq 1\}$. An important property
of the combinators in $C_0^{()}$ is that they are head-invariant:^2 X is
head-invariant if, for any redex of the form $X x \bar{P}$ with
$x \notin Var(\bar{P})$, there exists $\bar{T}$ such that
$X x \bar{P} \twoheadrightarrow x \bar{T}$ and $x \notin Var(\bar{T})$.

The second abbreviation we introduce is that of a list combinator. The notation
for a list is $\langle X_1, \ldots, X_n \rangle$, with $\langle\rangle$
representing the empty list and $X_i \in C_0^{()}$. We define:

$\langle\rangle = I$

$\langle X_1, X_2, \ldots, X_n \rangle = X_1 \langle X_2, \ldots, X_n \rangle$

Note that $\langle X \rangle = X \langle\rangle = XI$. This definition is
isomorphic to that of a list data structure, the first clause defining the empty
list and the second list construction. The list combinator
$\langle X_1, X_2, \ldots, X_n \rangle$ actually abbreviates the term

$X_1(X_2(\ldots(X_n(I))\ldots))$

We may think of the combinators in a list as sequentially applied left-to-right
on a 1-based input string. For example

$\langle W_{(3)}, C_{(2)}, B_{(3)} \rangle x_1 x_2 x_3 \twoheadrightarrow \langle C_{(2)}, B_{(3)} \rangle x_1 x_2 x_3 x_3 \twoheadrightarrow \langle B_{(3)} \rangle x_1 x_3 x_2 x_3$
$\twoheadrightarrow \langle\rangle x_1 x_3 (x_2 x_3) \twoheadrightarrow x_1 x_3 (x_2 x_3)$

so $\langle W_{(3)}, C_{(2)}, B_{(3)} \rangle$ defines S. Note that if X is
head-invariant, $X_{(n)}$ and $\langle X_{(n+1)} \rangle$ are interdefinable.
The solutions we will compute for simple constraints will always be list
combinators, and that will guarantee their extensibility to complex
constraints.

^1 The original formulation actually sets $X_{(0)} = X$; our 1-based notation
suits list combinators better.

^2 The literature calls these regular combinators; "head-invariant" was chosen
by analogy with the notion of head normal form.
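The reading of a list combinator as a sequence of edits on a 1-based argument
string can be simulated directly. The sketch below is an illustration under
assumptions made here: a list combinator is encoded as a Python list of
(name, index) pairs, only B, C, W and K are covered, and `apply_list` is a
hypothetical helper that models the effect of each deferred combinator rather
than performing term reduction.

```python
def apply_list(ops, args):
    """Apply a list combinator [(name, i), ...] to a 1-based argument string.

    X_(i) acts at position i: B groups positions i and i+1, C swaps them,
    W duplicates position i and K deletes it.
    """
    s = list(args)
    for name, i in ops:
        k = i - 1                              # 0-based index of position i
        if name == "B":
            s[k:k + 2] = [(s[k], s[k + 1])]    # ... a b ...  ->  ... (a b) ...
        elif name == "C":
            s[k], s[k + 1] = s[k + 1], s[k]    # ... a b ...  ->  ... b a ...
        elif name == "W":
            s.insert(k, s[k])                  # ... a ...    ->  ... a a ...
        elif name == "K":
            del s[k]                           # ... a ...    ->  ... ...
        else:
            raise ValueError(f"unsupported combinator {name}")
    return s

# <W_(3), C_(2), B_(3)> x1 x2 x3 yields x1 x3 (x2 x3), the contractum of S x1 x2 x3.
print(apply_list([("W", 3), ("C", 2), ("B", 3)], ["x1", "x2", "x3"]))
# ['x1', 'x3', ('x2', 'x3')]
```

The intermediate strings produced by the loop are exactly the ones displayed in
the example above: x1 x2 x3 x3, then x1 x3 x2 x3, then x1 x3 (x2 x3).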

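If $X_{L_1}$ and $X_{L_2}$ are two list combinators, their concatenation is
represented by $X_{L_1} \cdot X_{L_2}$ and defined inductively as the usual list
concatenation:

$\langle\rangle \cdot X_L = X_L$
$\qquad \langle X_1, X_2, \ldots, X_n \rangle \cdot X_L = X_1 (\langle X_2, \ldots, X_n \rangle \cdot X_L)$

As a result,
$\langle X_{n_1}, \ldots, X_{n_p} \rangle \cdot \langle X_{m_1}, \ldots, X_{m_q} \rangle = \langle X_{n_1}, \ldots, X_{n_p}, X_{m_1}, \ldots, X_{m_q} \rangle$
and $\cdot$ is associative. We write
$X_1 \cdot \langle X_2, \ldots, X_n \rangle$ to represent
$\langle X_1, X_2, \ldots, X_n \rangle$.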

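Lemma 2 (Concatenation Lemma). Let $X_{L_1}, X_{L_2}$ be list combinators, and
let $\bar{P}$, Q be pure terms. Then
$(X_{L_1} \cdot X_{L_2})\bar{P} \twoheadrightarrow Q$ iff there exists a pure
$\bar{O}$ such that

$X_{L_1}\bar{P} \twoheadrightarrow \bar{O}$ and $X_{L_2}\bar{O} \twoheadrightarrow Q$

Proof. By induction on the length of $X_{L_1}$. The base case is
$X_{L_1} = \langle\rangle$, which holds trivially. For the inductive case, first
consider ($\Rightarrow$). Assume that
$(X_1 \cdot X_{L_1}) \cdot X_{L_2}\bar{P} \twoheadrightarrow Q$. Then there
exists a $\bar{T}$ with $X_1 x \bar{P} \twoheadrightarrow x\bar{T}$, such that

$(X_1 \cdot X_{L_1}) \cdot X_{L_2}\bar{P} = X_1 \cdot (X_{L_1} \cdot X_{L_2})\bar{P} = X_1 (X_{L_1} \cdot X_{L_2})\bar{P} \twoheadrightarrow (X_{L_1} \cdot X_{L_2})\bar{T}$

This last $\twoheadrightarrow$-step holds because $X_1$ is head-invariant. By
the Church-Rosser property, $(X_{L_1} \cdot X_{L_2})\bar{T} \twoheadrightarrow Q$
and then, by induction hypothesis, there exists a pure term $\bar{O}$ such that

$X_{L_1}\bar{T} \twoheadrightarrow \bar{O}$ and $X_{L_2}\bar{O} \twoheadrightarrow Q$.

But because $X_1$ is head-invariant,
$(X_1 \cdot X_{L_1})\bar{P} \twoheadrightarrow X_{L_1}\bar{T} \twoheadrightarrow \bar{O}$.

For ($\Leftarrow$), assume
$X_1 \cdot X_{L_1}\bar{P} \twoheadrightarrow \bar{O}$ and
$X_{L_2}\bar{O} \twoheadrightarrow Q$. From $X_1$ being head-invariant it
follows that $X_1 X_{L_1}\bar{P} \twoheadrightarrow X_{L_1}\bar{T}$ and, by
Church-Rosser, $X_{L_1}\bar{T} \twoheadrightarrow \bar{O}$. So applying the
induction hypothesis we obtain

$(X_{L_1} \cdot X_{L_2})\bar{T} \twoheadrightarrow Q$

Again, because $X_1$ is head-invariant,
$X_1 \cdot (X_{L_1} \cdot X_{L_2})\bar{P} \twoheadrightarrow (X_{L_1} \cdot X_{L_2})\bar{T} \twoheadrightarrow Q$,
finishing the proof. □

Finally, define $P^\flat$, the flattening of a term P, inductively as:

$x^\flat = x$      $(Px)^\flat = P^\flat x$      $(P(QR))^\flat = ((PQ)R)^\flat$

$P^\flat$ removes all the non-implicit parentheses of P, e.g.
$(x(y(zx)))^\flat = xyzx$. $P^\flat$ has the form $x_1 \ldots x_m$, preserving
the order of the $x_i$ occurring in P.

We can now introduce a four-step algorithm^3 to solve the simple constraint
$X\bar{x} \twoheadrightarrow Q(\bar{y})$, for $\bar{y} \subseteq \bar{x}$, by
transforming Q into $\bar{x}$.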

Algorithm 1 (Simple Constraint Solver).

Input: $\bar{x}$ and $Q(\bar{y})$, with $\bar{y} \subseteq \bar{x}$.

Output: A list combinator $X_L$ such that $X_L\bar{x} \twoheadrightarrow Q$.

B-phase: Flatten Q into $Q^\flat$. Generate
$X_B = \langle B_{(b_1)}, \ldots, B_{(b_k)} \rangle$ as follows:

    $X_B := \langle\rangle$; $T := Q$;
    While T is of the form $z_1 \ldots z_i (O_1 O_2) \bar{O}_3$
        $T := z_1 \ldots z_i O_1 O_2 \bar{O}_3$;
        $X_B := B_{(i+1)} \cdot X_B$;
    return $X_B$;

At the end of the B-phase, $T = Q^\flat$.

C-phase: Order $Q^\flat$ into $Q^c$ by any order $\prec$ on $\bar{y}$. Generate
the list combinator $X_C = \langle C_{(c_1)}, \ldots, C_{(c_l)} \rangle$ as
follows:

    Apply the bubble sort algorithm to $Q^\flat$. Each bubble inversion
    between the elements at positions i and i+1 adds a $C_{(i)}$ to the
    front of $X_C$.

At the end of the C-phase $Q^c = y_1 \ldots y_1 \ldots y_m \ldots y_m$ and
$X_C Q^c \twoheadrightarrow Q^\flat$.

W-phase: Eliminate duplicates from $Q^c$, obtaining $\bar{y}$. Generate the list
combinator $X_W = \langle W_{(w_1)}, \ldots, W_{(w_k)} \rangle$ as follows:

    $X_W := \langle\rangle$; $i := 1$;
    $T := Q^c$; (T is of the form $y_1 \ldots y_1 \ldots y_m \ldots y_m$)
    While $i < |T|$
        If $t_i = t_{i+1}$ then
            Delete $t_{i+1}$ from T;
            $X_W := W_{(i)} \cdot X_W$;
        Else increment i;
    return $X_W$;

At the end of the W-phase, $T = y_1 \ldots y_m = \bar{y}$, without repetitions.

K-phase: Add extra variables to $\bar{y}$ to make it identical to $\bar{x}$.
Generate the list combinator
$X_K = \langle K_{(k_1)}, \ldots, K_{(k_j)} \rangle$ as follows:

    Let $\bar{x} = x_1 \ldots x_n$, $\bar{y} = y_1 \ldots y_m$, $m \leq n$;
    $X_K := \langle\rangle$; $i := 1$;
    While $i \leq |\bar{x}|$
        If $i > m$ or $x_i \neq y_i$ then
            Delete $x_i$ from $\bar{x}$; (corresponds to adding $x_i$ to $\bar{y}$)
            $X_K := X_K \cdot \langle K_{(i)} \rangle$;
        Else increment i;
    return $X_K$;

At the end of the K-phase, $\bar{x} = \bar{y}$.

Return: $X_K \cdot X_W \cdot X_C \cdot X_B$.

^3 It must be noticed that such a method was hinted at, without details, proof
of correctness or list notation, in Section 5B1 of the cited reference.
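The four phases can be prototyped in a few lines. The Python sketch below is one
possible reading of Algorithm 1, not the authors' implementation: pure terms are
encoded as nested pairs, a list combinator as a list of (name, index) pairs, and
the names `term`, `spine`, `solve_simple` and `apply_list` are assumptions made
here. The B-phase always flattens the leftmost compound argument, which is one
of several valid choices, and the order on variables is the order of `xs`.

```python
from functools import reduce

def term(args):
    """Fold an argument string into a left-associated application."""
    return reduce(lambda f, a: (f, a), args)

def spine(t):
    """Head atom of a term followed by its arguments, as a list."""
    args = []
    while isinstance(t, tuple):
        t, a = t
        args.append(a)
    return [t] + args[::-1]

def solve_simple(xs, Q):
    """Ops of a list combinator X_L with X_L xs ->> Q, for Var(Q) within xs."""
    T, XB, XC, XW, XK = spine(Q), [], [], [], []
    while any(isinstance(t, tuple) for t in T):          # B-phase: flatten Q
        i = next(j for j, t in enumerate(T) if isinstance(t, tuple))
        T[i:i + 1] = list(T[i])
        XB.insert(0, ("B", i + 1))
    if not set(T) <= set(xs):
        raise ValueError("constraint is not admissible")
    rank = {x: k for k, x in enumerate(xs)}
    swapped = True
    while swapped:                                       # C-phase: bubble sort
        swapped = False
        for j in range(len(T) - 1):
            if rank[T[j]] > rank[T[j + 1]]:
                T[j], T[j + 1] = T[j + 1], T[j]
                XC.insert(0, ("C", j + 1))
                swapped = True
    i = 0
    while i < len(T) - 1:                                # W-phase: drop duplicates
        if T[i] == T[i + 1]:
            del T[i + 1]
            XW.insert(0, ("W", i + 1))
        else:
            i += 1
    rest, i = list(xs), 0
    while i < len(rest):                                 # K-phase: drop unused x_i
        if i >= len(T) or rest[i] != T[i]:
            del rest[i]
            XK.append(("K", i + 1))
        else:
            i += 1
    return XK + XW + XC + XB

def apply_list(ops, args):
    """Effect of a list combinator on a 1-based argument string."""
    s = list(args)
    for name, i in ops:
        k = i - 1
        if name == "B":
            s[k:k + 2] = [(s[k], s[k + 1])]
        elif name == "C":
            s[k], s[k + 1] = s[k + 1], s[k]
        elif name == "W":
            s.insert(k, s[k])
        elif name == "K":
            del s[k]
    return s

ops = solve_simple(["x", "y"], ("y", ("x", "x")))        # X x y ->> y(xx)
print(ops)   # [('W', 1), ('C', 2), ('C', 1), ('B', 2)]
```

Applying the result with `apply_list(ops, ["x", "y"])` and folding it with
`term` gives back y(xx), which is a quick sanity check of the four invariants
stated at the end of each phase.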

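Theorem 1 (Simple Constraint Solution). Any admissible simple structural
constraint of the form $X\bar{x} \twoheadrightarrow Q(\bar{y})$,
$\bar{y} \subseteq \bar{x}$, has as a solution a list combinator computed by
Algorithm 1.

Proof. It suffices to note that, in Algorithm 1, where $Q^\flat$ is the
flattening of Q and $Q^c$ is the sorted string produced by the C-phase:

- In the B-phase, $X_B Q^\flat \twoheadrightarrow Q$.

- In the C-phase, $X_C Q^c \twoheadrightarrow Q^\flat$.

- In the W-phase, $X_W \bar{y} \twoheadrightarrow Q^c$.

- In the K-phase, $X_K \bar{x} \twoheadrightarrow \bar{y}$.

Then, by three consecutive applications of the Concatenation Lemma we get
$(X_K \cdot X_W \cdot X_C \cdot X_B)\bar{x} \twoheadrightarrow Q(\bar{y})$. □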

As an example of the application of Algorithm 1, consider the structural
constraint of Section 1, $Xxy \twoheadrightarrow y(xx)$. The B-step gives yxx
and $\langle B_{(2)} \rangle$; the C-step gives xxy and
$\langle C_{(2)}, C_{(1)} \rangle$; the W-step leads to xy and
$\langle W_{(1)} \rangle$; and the K-step gives $\langle\rangle$. Therefore,
$\langle W_{(1)}, C_{(2)}, C_{(1)}, B_{(2)} \rangle =_w W(BC(CB))$ solves the
constraint.

It is worth noting that the list format highlights the type of combinators that
really correspond to structural rules; e.g. the list combinator
$\langle W_{(2)}, C_{(1)} \rangle$ represents the usage of contraction (W) and
commutativity (C), but not of associativity (B), even though its "explicit"
format is BW(CI).

With respect to the complexity of Algorithm 1, steps B, W and K traverse
the input string only once; therefore those phases are linear in the number
N of variables (distinct or not) in the input term. Concatenation is also
linear. Step C is the bubble sort algorithm and its complexity is $O(N^2)$. It
is the dominating part of the algorithm, whose time complexity is then
$O(N^2)$. We could reduce the complexity of the algorithm to $O(N \log N)$ by
replacing combinator C with a "more efficient" version of it, but we will not do
it here.

A final remark on Algorithm 1 is that its output is not the only list
combinator that solves the constraint. For instance,
$\langle W_{(1)}, B_{(1)}, C_{(1)} \rangle$ is also a solution to
$Xxy \twoheadrightarrow y(xx)$.

4 Complex Structural Constraints 

We now face the problem of solving complex structural constraints of the form

$X\bar{P} \twoheadrightarrow Q$.

Unfortunately, Lemma 1 does not generalise to complex constraints, and there
are admissible complex constraints without a solution in $C_0$. In fact, the
combinator rule $(B \vdash)$ represents the structural rule for
left-associativity, but there is no combinator that represents
right-associativity. Let us formally show that.

A context M[ ] is a combinator term with "holes" or, more formally, is a
combinator term built by extending the set of atomic terms with "[ ]". If M[ ]
is a context and P is a term, M[P] is obtained by placing P in the holes of
M[ ].

A combinator system C represents right-associativity iff there is a context
M[ ] in it such that $M[P(QR)] \twoheadrightarrow (PQ)R$, for any terms P, Q
and R.

Lemma 3. Suppose there is a combinator system C that contains combinators
K and I and represents right-associativity. Then:

(a) C is not CR.

(b) C is inconsistent.

Proof. Suppose there is a context M[ ] such that
$M[P(QR)] \twoheadrightarrow (PQ)R$. Make P = K, Q = I and R = xx. If we start
reducing the context we get

$M[K(I(xx))] \twoheadrightarrow KI(xx) \twoheadrightarrow I$

but if we start reducing the redex I(xx) we get

$M[K(I(xx))] \twoheadrightarrow M[K(xx)] \twoheadrightarrow Kxx \twoheadrightarrow x$

Since both I and x are in normal form, the system cannot be CR, proving (a).

The proof of (b) follows directly, because we have just shown that
$x =_w I =_w y$ for any x and y. □

Since the $C_0$-system is well known to be both CR and consistent, neither
$C_0$ nor any consistent extension of it can represent right-associativity.
However, we do want to have a consistent system that represents
right-associativity, without throwing away any of the combinators in $C_0$
(e.g. getting rid of K would invalidate Lemma 1).

To overcome this problem, we propose to restrict the set of combinator terms 
available for representing structural rules. 

4.1 The Restricted Combinator Systems 

We define the set of restricted terms $C^1_C \subseteq C^*_C$:

- every term in C-normal form is in $C^1_C$;

- if $Q \in C^1_C$ and $P \twoheadrightarrow Q$ such that
  $NRedex_C(P) \leq 1$, then $P \in C^1_C$.

It follows that there is at most one redex in a term in $C^1_C$. We chose the
set $C^1_C$ as the appropriate restriction on allowed terms to represent
structural rules, concentrating on restricted combinator systems
$\langle C^1_C, \twoheadrightarrow_C \rangle$. As usual, we write
$\langle C^1, \twoheadrightarrow \rangle$ when the basis is clear from the
context. Equality is defined as usual and restricted to $C^1$.

Note that the definition above accepts a combinator as restricted only if all
its reductions to a normal form have one redex. In this sense, the term SIIM is
not a restricted combinator, even though it has a single redex, for
$SIIM \twoheadrightarrow IM(IM)$, which has two redexes. So, apparently, to
verify whether a term is a restricted combinator, one will have to try to reduce
it totally. However, as we will see in Lemma 5, the handy class of list
combinators is guaranteed to be of the restricted format and one can easily
verify membership in that class. But first, some nice properties of restricted
combinators.

Lemma 4. Consider the restricted combinator system
$\langle C^1, \twoheadrightarrow \rangle$ for a given basis C. Then:

(a) $C^1$ is closed under $\twoheadrightarrow$, i.e. if $P \in C^1$ and
$P \twoheadrightarrow Q$ then $Q \in C^1$.

(b) The combinator system is CR.

(c) The combinator system is consistent.

Proof. (a) Since all combinators in C are functional, this follows from the
definition.

(b) The number of redexes in any term is at most 1, so the system is trivially
CR.

(c) Since the system is CR, any two distinct nf-terms are not equal. □

What is remarkable about this proof is that its only requirement of C is that
its combinators be functional. This means we can now extend $C_0$ with
functional combinators and still remain (restrictedly) consistent. In the rest
of this paper, we deal only with restricted combinator systems of the form
$\langle C^1, \twoheadrightarrow \rangle$. We will look for solutions to complex
structural constraints in such a restricted framework.

One might think that constraining terms to at most one redex is too
restrictive. This is not the case. The problem we are trying to solve is to be
able to represent the use of structural rules in every intuitionistic deduction
by means of combinators. It happens that we cannot do that with generic
combinators, but we can do that with restricted ones; see Figure 3 below.
The weakness of the terms turns into a benefit, for it allows for more efficient
algorithms.

4.2 Inverse Combinators and Complex Combinators 

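Our idea is to systematically add new combinators to our system to cope with
$C_0$'s deficiency in capturing all possible intuitionistic structural rules.
Thus, we systematically introduce functional inverses of all combinators.

For each combinator in $C_0 - \{K\}$ we define a corresponding primitive inverse
combinator,^4 giving their basic reduction rules as follows:

$B^{-1}P(QR) \rhd PQR$      $W^{-1}PQQ \rhd PQ$      $S^{-1}PR(QR) \rhd PQR$

$C^{-1}PRQ \rhd PQR$        $I^{-1}P \rhd P$

The idea behind primitive inverse combinators is to have a combinator
$X^{-1}P \twoheadrightarrow \bar{x}$ corresponding to a standard primitive
combinator defined by $X\bar{x} \twoheadrightarrow P$. Note that the primitive
inverses are also head-invariant.

^4 The literature defines the distinct notion of dual combinators, which are
combinators that operate right-to-left: $z(y(xC)) \twoheadrightarrow y(zx)$.
Unfortunately, duals were represented there by the $^{-1}$ symbol normally used
for inverses; but inverses are not duals.

identity:
$\dfrac{\Gamma[\Phi] \vdash \chi}{\Gamma[I, \Phi] \vdash \chi}\ (I \vdash)$

left-associativity:
$\dfrac{\Gamma[\Phi, (\Psi, \Xi)] \vdash \chi}{\Gamma[B, \Phi, \Psi, \Xi] \vdash \chi}\ (B \vdash)$

contraction:
$\dfrac{\Gamma[\Phi, \Psi, \Psi] \vdash \chi}{\Gamma[W, \Phi, \Psi] \vdash \chi}\ (W \vdash)$

commutativity:
$\dfrac{\Gamma[\Phi, \Xi, \Psi] \vdash \chi}{\Gamma[C, \Phi, \Psi, \Xi] \vdash \chi}\ (C \vdash)$

right-associativity:
$\dfrac{\Gamma[\Phi, \Psi, \Xi] \vdash \chi}{\Gamma[B^{-1}, \Phi, (\Psi, \Xi)] \vdash \chi}\ (B^{-1} \vdash)$

expansion:
$\dfrac{\Gamma[\Phi, \Psi] \vdash \chi}{\Gamma[W^{-1}, \Phi, \Psi, \Psi] \vdash \chi}\ (W^{-1} \vdash)$

thinning:
$\dfrac{\Gamma[\Phi] \vdash \chi}{\Gamma[K, \Phi, \Psi] \vdash \chi}\ (K \vdash)$

Fig. 3. Combinator rules and their associated structural rules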



Combinator $B^{-1}$ represents right associativity, and therefore cannot be
represented in terms of the combinators in $C_0$. Combinator $W^{-1}$ demands
two terms to be identical in the redex, and is also not definable in $C_0$.
$S^{-1}$ is also not definable in $C_0$. However, $C^{-1}$ has exactly the same
behaviour as C, so we make $C^{-1} = C$; similarly, $I^{-1} = I$.

The inverse of K was ruled out. If $K^{-1}$ were allowed, it would be possible
to build a combinator that reduces to any term. In fact, since for any P and Q
$KPQ \rhd P$, then $K^{-1}P \rhd PQ$, so $K^{-1}I \twoheadrightarrow Q$ for any
Q. This means that for any P and Q, $P =_w K^{-1}I =_w Q$; so $K^{-1}$ is not
functional and it trivializes reduction, making the system inconsistent.

We can add these new combinators to $C_0$ and use the result as a basis.

Definition 1 (Complex Combinators). Let
$C_1 = C_0 \cup \{S^{-1}, B^{-1}, W^{-1}\}$ be the set of primitive complex
combinators. The complex combinator system is the restricted system
$\langle C^1_{C_1}, \twoheadrightarrow_{C_1} \rangle$ based on $C_1$; we will
represent it only as $\langle C^1, \twoheadrightarrow \rangle$ when the context
of complex combinators is clear. A complex combinator is a term in $C^1$ built
only with the primitive combinators in $C_1$.

The set of relevant structural rules representable by combinator rules in
SFL($C_1$) is illustrated in Figure 3.

A standard combinator is one without the occurrence of an inverse. Equality
for complex combinator terms is defined in the usual way. We also write
$X^{-1}_{(i)}$ for $(X^{-1})_{(i)}$. We then define
$C_1^{()} = \{X_{(i)} \mid X \in C_1 - \{I\}, i \geq 1\}$, and build complex
list combinators with the elements of $C_1^{()}$, as in the standard case.

We now extend the notion of inverses to a larger class of complex combinators.

Definition 2 (Complex Inverses). Let X and X* be K-free complex combinators.
X* is an inverse combinator of X whenever

if $X\bar{P} \twoheadrightarrow Q$ then $X^*Q \twoheadrightarrow \bar{P}$

for all pure terms $\bar{P}$ and Q, $Var(\bar{P}) = Var(Q)$.

It follows directly from Definition 2 that if X* is an inverse of X then X is
an inverse of X*. Not all complex combinators in $C^1$ have an inverse; apart
from combinators in which K occurs, the divergent combinators such as WWW and
WI(WI) do not have an inverse^5 according to Definition 2. Nor is the inverse
unique: $\langle C \rangle$ has inverses $\langle C \rangle$,
$\langle C, C, C \rangle$, $\langle C, C, C, C, C \rangle$, etc.

By dealing with complex list combinators based on $C_1^{()}$, we are guaranteed
to stay inside the restricted framework, and therefore we have a consistent and
CR system to represent structural rules, as shown below.

Lemma 5. Let $X_L = \langle X_1, \ldots, X_n \rangle$ such that each
$X_i \in C_1^{()}$. Then $X_L \in C^1$. If $\bar{P}$ is pure,
$X_L\bar{P} \in C^1$.

Proof. By induction on n. For the base case, just note that
$\langle\rangle = I \in C^1$. Then note that if $X_2 \in C^1$ and $X_1$ is
head-invariant then $X_1 X_2 \in C^1$ because $X_1 X_2$ is in normal form, which
takes care of the induction case. A similar induction shows that
$X_L\bar{P} \in C^1$. □

We next prove that a K-free complex list combinator always has an inverse
that is easily computed.



4.3 Computing List Inverses 

We start by defining a function $X^{-1}$ on each K-free $X \in C_1^{()}$ (so
far, $X^{-1}$ was defined only for $X \in C_0$). Let $X \in C_1 - \{I, K\}$ and
$n \geq 1$. Then:

$(X_{(n)})^{-1} = (X^{-1})_{(n)} = X^{-1}_{(n)}$

It is not difficult to see that for K-free $X \in C_1^{()}$, $X^{-1}$ is an
inverse of it. Extend this definition to list combinators in the following way.
Let $X_L$ be a K-free list combinator and $X \in C_1^{()}$:

$(\langle\rangle)^{-1} = \langle\rangle$

$(\langle X \rangle \cdot X_L)^{-1} = X_L^{-1} \cdot \langle X^{-1} \rangle$

In other words,
$\langle X_1, \ldots, X_n \rangle^{-1} = \langle X_n^{-1}, \ldots, X_1^{-1} \rangle$.

We still need to show that $X_L^{-1}$ is actually an inverse of $X_L$. First, we
need to extend the Concatenation Lemma for complex list combinators.

^5 But note that these combinators are not equivalent to list combinators.

Lemma 6. The Concatenation Lemma holds for complex combinators. That is,
for $C_1^{()}$-based complex combinator lists $X_{L_1}$ and $X_{L_2}$ and pure
terms $\bar{P}$ and Q, $(X_{L_1} \cdot X_{L_2})\bar{P} \twoheadrightarrow Q$ iff
there exists a pure $\bar{O}$ such that
$X_{L_1}\bar{P} \twoheadrightarrow \bar{O}$ and
$X_{L_2}\bar{O} \twoheadrightarrow Q$.

Proof. Just notice that the proof of the Concatenation Lemma (Lemma 2) uses two
basic properties of the elements of the list: the fact that they are
head-invariant and that they possess the Church-Rosser property. Both of them
are present in complex list combinators due to the fact that members of
$C_1^{()}$ are head-invariant and to Lemma 4. So the proof of Lemma 2 applies to
restricted complex list combinators. □

Theorem 2. Let $X_L = \langle X_1, \ldots, X_n \rangle$ be a K-free complex list
combinator. Then $X_L^{-1}$ is an inverse of $X_L$.

Proof. By induction on the length n of the list. The base case trivially deals
with the empty list. For the inductive case, let $\bar{P}$ and $\bar{O}$ be any
pure terms such that $X_L\bar{P} \twoheadrightarrow \bar{O}$.

Since $X_L = \langle X_1 \rangle \cdot \langle X_2, \ldots, X_n \rangle$, by the
Concatenation Lemma, there must exist a pure $\bar{Q}$ such that
$\langle X_1 \rangle \bar{P} \twoheadrightarrow \bar{Q}$ and
$\langle X_2, \ldots, X_n \rangle \bar{Q} \twoheadrightarrow \bar{O}$.

From the induction hypothesis, we get
$\langle X_2, \ldots, X_n \rangle^{-1}\bar{O} \twoheadrightarrow \bar{Q}$. Also
note that $\langle X_1^{-1} \rangle \bar{Q} \twoheadrightarrow \bar{P}$. Then,
by the Concatenation Lemma,

$X_L^{-1}\bar{O} = (\langle X_2, \ldots, X_n \rangle^{-1} \cdot \langle X_1^{-1} \rangle)\bar{O} \twoheadrightarrow \bar{P}$

finishing the proof. □

Algorithm 2 (List Inverter).

Input: a K-free list combinator of the form $\langle X_1, \ldots, X_n \rangle$.

Output: $\langle X_n^{-1}, \ldots, X_1^{-1} \rangle$.
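Algorithm 2 amounts to reversing the list and inverting each member. The
following Python sketch illustrates this under an encoding assumed here, not
taken from the paper: a list combinator is a list of (name, index) pairs and an
inverse primitive is written with the suffix "-1", so that "B-1" stands for
$B^{-1}$; `invert_list` is a hypothetical name.

```python
def invert_list(ops):
    """Inverse of a K-free list combinator given as [(name, i), ...]."""
    def inv(name):
        if name == "K":
            raise ValueError("K has no inverse")
        if name == "C":                    # C^{-1} = C
            return name
        # Toggle the "-1" suffix, so that inverting twice gives the original.
        return name[:-2] if name.endswith("-1") else name + "-1"
    return [(inv(name), i) for name, i in reversed(ops)]

# Inverse of <W_(1), C_(2), C_(1), B_(2)>, the solution of X x y ->> y(xx).
print(invert_list([("W", 1), ("C", 2), ("C", 1), ("B", 2)]))
# [('B-1', 2), ('C', 1), ('C', 2), ('W-1', 1)]
```

The traversal is a single pass over the list, in line with the linear time
bound claimed for the list inverter.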

4.4 Solutions for Complex Constraints 

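Given an admissible structural constraint^6 $X\bar{P} \twoheadrightarrow Q$, the
idea is to find $\bar{x}$ such that $X_{L_1}\bar{x} \twoheadrightarrow \bar{P}$
and $X_{L_2}\bar{x} \twoheadrightarrow Q$. The solution will be
$X = X_{L_1}^{-1} \cdot X_{L_2}$.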

Algorithm 3 (Structural Constraint Solver).

Input: pure terms $\bar{P}$ and Q, $Var(Q) \subseteq Var(\bar{P})$.

Output: a complex list combinator $X_L$ satisfying
$X_L\bar{P} \twoheadrightarrow Q$.

Let $\bar{x} = Var(\bar{P})$. Establish an arbitrary order between the
$x_i \in \bar{x}$, $x_1 \prec \ldots \prec x_n$. Then:

1. Using Algorithm 1, compute the (standard) list combinator $X_{L_1}$ such that
$X_{L_1}\bar{x} \twoheadrightarrow \bar{P}$.

2. Using the list inverter (Algorithm 2), compute $X_{L_1}^{-1}$.

3. Again using the simple constraint solver (Algorithm 1), compute the list
combinator $X_{L_2}$ such that $X_{L_2}\bar{x} \twoheadrightarrow Q$.

Return $X_L = X_{L_1}^{-1} \cdot X_{L_2}$.

^6 According to the assumption made in Section 2, in
$\bar{P} = P_1 \ldots P_n$ the term for $P_1$ is atomic; see at the end of this
section how to avoid such a restriction.

Steps 1 and 3 have the complexity of the simple constraint solver. Step 2 is
the list inverter, with linear time complexity. So Algorithm 3 has complexity
$O(N^2)$, where N is the maximum of the lengths of $\bar{P}$ and Q.
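The three steps compose into a short program. The sketch below is an
illustration under assumptions introduced here rather than the authors'
implementation: pure terms are nested pairs, list combinators are lists of
(name, index) pairs with "B-1" and "W-1" standing for $B^{-1}$ and $W^{-1}$, the
helper names are hypothetical, and $P_1$ is assumed atomic as in the text. The
B-phase of the simple solver flattens the leftmost compound argument first, so
the list returned may differ from another valid solution.

```python
from functools import reduce

def term(args):
    """Fold an argument string into a left-associated application."""
    return reduce(lambda f, a: (f, a), args)

def spine(t):
    """Head atom of a term followed by its arguments, as a list."""
    args = []
    while isinstance(t, tuple):
        t, a = t
        args.append(a)
    return [t] + args[::-1]

def solve_simple(xs, Q):
    """Algorithm 1 sketch: ops of X_L with X_L xs ->> Q; xs fixes the order."""
    rank = {x: k for k, x in enumerate(xs)}
    T, XB, XC, XW, XK = spine(Q), [], [], [], []
    while any(isinstance(t, tuple) for t in T):          # B-phase
        i = next(j for j, t in enumerate(T) if isinstance(t, tuple))
        T[i:i + 1] = list(T[i])
        XB.insert(0, ("B", i + 1))
    swapped = True
    while swapped:                                       # C-phase (bubble sort)
        swapped = False
        for j in range(len(T) - 1):
            if rank[T[j]] > rank[T[j + 1]]:
                T[j], T[j + 1] = T[j + 1], T[j]
                XC.insert(0, ("C", j + 1))
                swapped = True
    i = 0
    while i < len(T) - 1:                                # W-phase
        if T[i] == T[i + 1]:
            del T[i + 1]
            XW.insert(0, ("W", i + 1))
        else:
            i += 1
    rest, i = list(xs), 0
    while i < len(rest):                                 # K-phase
        if i >= len(T) or rest[i] != T[i]:
            del rest[i]
            XK.append(("K", i + 1))
        else:
            i += 1
    return XK + XW + XC + XB

def invert_list(ops):
    """Algorithm 2 sketch: reverse the list and invert each member."""
    def inv(name):
        if name == "K":
            raise ValueError("K has no inverse")
        if name == "C":
            return name
        return name[:-2] if name.endswith("-1") else name + "-1"
    return [(inv(n), i) for n, i in reversed(ops)]

def solve_complex(p_bar, Q, order):
    """Algorithm 3 sketch: ops of X_L with X_L p_bar ->> Q; order lists Var(P)."""
    return invert_list(solve_simple(order, term(p_bar))) + solve_simple(order, Q)

def apply_list(ops, args):
    """Effect of a complex list combinator on a 1-based argument string."""
    s = list(args)
    for name, i in ops:
        k = i - 1
        if name == "B":
            s[k:k + 2] = [(s[k], s[k + 1])]
        elif name == "B-1":
            if not isinstance(s[k], tuple):
                raise ValueError("B-1 needs a compound argument")
            s[k:k + 1] = list(s[k])
        elif name == "C":
            s[k], s[k + 1] = s[k + 1], s[k]
        elif name == "W":
            s.insert(k, s[k])
        elif name == "W-1":
            if s[k] != s[k + 1]:
                raise ValueError("W-1 needs two identical arguments")
            del s[k + 1]
        elif name == "K":
            del s[k]
    return s

# X y (xx) ->> x y with the order x before y.
sol = solve_complex(["y", ("x", "x")], ("x", "y"), ["x", "y"])
print(sol)   # [('B-1', 2), ('C', 1), ('C', 2), ('W-1', 1)]
```

Running `apply_list(sol, ["y", ("x", "x")])` returns the string x y, which
checks the solution against the constraint it was computed for.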

Theorem 3 (Complex Solution). A complex structural constraint
$X\bar{P} \twoheadrightarrow Q$ has a $C_1^{()}$-list solution iff it is
admissible; this solution is computed by Algorithm 3.

Proof. If $Var(Q) \not\subseteq Var(\bar{P})$ then there is a variable
$x \in Var(Q) - Var(\bar{P})$, and since no combinator creates new variables
there cannot exist a solution for $X\bar{P} \twoheadrightarrow Q$.

Now assume $Var(Q) \subseteq Var(\bar{P})$. Let $\bar{x} = Var(\bar{P})$. We
know from Theorem 1 that there exists a list combinator $X_{L_1}$ such that
$X_{L_1}\bar{x} \twoheadrightarrow \bar{P}$. From $\bar{x} = Var(\bar{P})$, it
follows that $X_{L_1}$ is K-free. From Theorem 2 we know that $X_{L_1}$ has an
inverse $X_{L_1}^{-1}$ such that $X_{L_1}^{-1}\bar{P} \twoheadrightarrow \bar{x}$.

Again by Theorem 1 we know there exists a list combinator $X_{L_2}$ such that
$X_{L_2}\bar{x} \twoheadrightarrow Q$ (this one may not be K-free). And finally,
by the Concatenation Lemma, we get that
$(X_{L_1}^{-1} \cdot X_{L_2})\bar{P} \twoheadrightarrow Q$, solving the intended
structural constraint. To finalise, just notice that
$(X_{L_1}^{-1} \cdot X_{L_2})$ is computed by Algorithm 3. □

Let us show a few examples of the application of Algorithm Q First, the 
solution for Xx{yz) -» xyz is (B^^) (which is the list combinator that defines 
B~^), where the algorithm above gives X^^ as (B( 2 )) and X^^ as () . 

The solution for constraint Xy{xx) xy is C, C( 2 ), W~^) (exactly the 

inverse of the solution for Xxy — » y{xx)). This solution was found by arbitrar- 
ily fixing the order x ^ y, had we fixed y < x the solution would have been 
, W^ 2 ))C). This shows that the solution is sensitive to the order chosen. 
However, note that the primitive combinators used in both solutions are the 
same. This should be always the case, except that the combinator C may be 
eliminated from some solutions, when a particular choice of ^ avoids reordering. 

The constraint X(xy) —» x violates the assumption made in Section 0 that 
in P = Pi .. . Pn, the term for Pi is atomic; in this case P = Pi = {xy), and 
Algorithm0 gives the wrong answer (K( 2 )) for it computes X^^ = () instead of 
(B) . However, a very simple alteration in the algorithm can treat this case by 
changing step 1 with 1'. 

1'. Take a new variable w ^ x and compute X'^^, Xf_^wx — » wP. Now compute 
Xlj by decrementing i in each X(,j) in X'^^; the rest of the algorithm remains 
the same. 

It is not hard to see that we have computed a X/,j such that XJ^P —>* x even 
when Pi is not atomic. In the case of X{xy) x, we compute Xf_wxy w{xy), 
obtaining (B( 2 )) , and X^^ is (B) ; then by computing Xi,.^xy ->* x we get (K( 2 )) 
and the final answer is (B~^, K( 2 )) . 
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4.5 Minimal Solutions 

Algorithm El does not give us “minimal” solutions. For example, given the 
constraint Xx{yz)ab -» x{yz){ab), Algorithm 0 finds the solution X = 
(B(2), B(2) , B(3)). However, X = (6(3)) is also a solution. It is a prefered so- 
lution according to the following definition. 

Let Xi and X2 be two solutions to XP — Q such that 

XiP — » Xi iP 1 1 — » . . . — » Xi nP l,n Q 

X2P — » X2pP 2,1 X 2 ^mP 2 ,m Q 

Then solution Xi is preferable to X2 if {Pi,i, . ■ . Pi,n} C {^2,1, ■ • ■ P2,m}- 

The preferred solution in the example above is easily computed by
preprocessing the input. The non-preferable solution is computed because
the variables y and z occur in x(yz)ab and x(yz)(ab) only as the subterm
yz. We substitute a new variable, say y', for the subterm yz, so that we now
solve the constraint Xxy'ab ↠ xy'(ab), obtaining X = (B(3)). Such a
preprocessing step was included in our implementation.
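The preprocessing step just described can be sketched as follows. This is only an illustration under stated assumptions, not the author's Java implementation: terms are represented as nested pairs, the helper names (`app`, `abstract_shared_subterm`) are hypothetical, and the policy of trying the smallest qualifying subterm first is our own choice, made so that the sketch reproduces the substitution of y' for yz in the example.

```python
from typing import Tuple, Union

# A term is a variable (str) or an application (function, argument).
Term = Union[str, Tuple["Term", "Term"]]

def app(*ts):
    """Left-associated application: app('x', 'y', 'z') is ((x y) z)."""
    t = ts[0]
    for u in ts[1:]:
        t = (t, u)
    return t

def variables(t):
    return {t} if isinstance(t, str) else variables(t[0]) | variables(t[1])

def subterms(t):
    yield t
    if not isinstance(t, str):
        yield from subterms(t[0])
        yield from subterms(t[1])

def size(t):
    return 1 if isinstance(t, str) else size(t[0]) + size(t[1])

def replace(t, old, new):
    if t == old:
        return new
    if isinstance(t, str):
        return t
    return (replace(t[0], old, new), replace(t[1], old, new))

def abstract_shared_subterm(p, q, fresh="y'"):
    """Replace a compound proper subterm s of p by `fresh` in both p and q,
    provided the variables of s occur in p and q only inside copies of s.
    Returns the simplified pair, or None when no subterm qualifies.
    `fresh` is assumed not to occur in p or q."""
    candidates = {s for s in subterms(p) if not isinstance(s, str) and s != p}
    for s in sorted(candidates, key=lambda c: (size(c), repr(c))):
        p2, q2 = replace(p, s, fresh), replace(q, s, fresh)
        if not (variables(s) & (variables(p2) | variables(q2))):
            return p2, q2
    return None

# x(yz)ab and x(yz)(ab): y and z occur only inside yz.
p = app("x", ("y", "z"), "a", "b")
q = app("x", ("y", "z"), ("a", "b"))
simplified = abstract_shared_subterm(p, q)

# x(xy) and xy: x occurs both inside and outside xy, so nothing qualifies.
blocked = abstract_shared_subterm(("x", ("x", "y")), ("x", "y"))
```

Here `simplified` is the pair xy'ab, xy'(ab), while `blocked` is `None`, matching the case where the simplification does not apply.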

However, the “minimal solution” is not unique. For example, consider the 
constraint Xx{xy) xy. The simplification above does not apply, for x occurs 
both inside and outside xy. Algorithm 0 yields X = However, 

if we do apply the simplification above, making y' = xy, we get the solution 
X = (K(i)) , which is also correct. There is no a priori reason to prefer either 
, or (K(i)) , so in this case the notion of “minimal” solution depends 

on further assumptions. There are several candidates for what constitutes a 
“minimal” solution: 

1. smallest number of K’s (avoid weakening at all cost); 

2. smallest number of W's and W^{-1}'s (avoid contraction/expansion at all cost);

3. smallest combinator size; etc. 

None of the above guarantees a unique minimal solution for all cases. For
condition (1), note that it is always possible to have at most one K in the
solution of XP ↠ Q: if Var(P) = Var(Q) then Algorithm 0 gives a K-free
solution. Otherwise, solve:

$$X_1 P \twoheadrightarrow \bar{x}(\bar{K}) \quad\text{and}\quad X_2\,\bar{x} \twoheadrightarrow Q$$

where $\bar{x} = Var(Q)$ and $\bar{K} = Var(P) - Var(Q)$. In this case, neither
$X_1$ nor $X_2$ contains a K, and $X = X_1 \cdot (K_{(n)}) \cdot X_2$, where $n = |\bar{x}|$.

For (2), we can eliminate all the occurrences of W^{-1} from the solution in
the following way: for each variable occurring more than once in P, let K above
contain all but one of its occurrences. Then no W^{-1} occurs in X.

Substitution of a common term by a new variable (even in cases where its 
variables occur elsewhere) decreases the number of W’s in the solution, but no 
minimal number is guaranteed this way. To obtain the combinator with the 
smallest size, one would have to try all the possible orders in Algorithm 0.

4.6 Implementation

An implementation of the Complex Structural Constraint Solver was written in
Java 1.1 and is available on the web at the following address:

http://www.ime.usp.br/~mfinger/sftp/java/sftp.html

5 Comparisons with the Literature 

The work in the literature that most closely approximates ours is that of
Natasha Kurtonina m- There, an algorithm for computing a first-order formula 
expressing the model-theoretic constraint associated to a CG axiom is presented. 
The approach, however, is semantical, based on the three-place relation for sub- 
structural model frames. In that sense, Kurtonina’s work is complementary to 
ours, which is purely syntactical. 

A similar but distinct problem of solvability is found in the literature, as 
studied by Wadsworth UHl- A term M is solvable if there are terms Ni,. . . ,Nr 
such that MNi . . . Nr =w I; solvability has applications in the study of the 
definability of partial functions p. 

In their presentation of SFL, Dunn and Meyer P introduce the notion of 
dual combinators , which operate right-to-left: z{y{xC)) y{zx). More recent 
developments have shown that several combinator systems involving both stan- 
dard and dual combinators are inconsistent (this is indeed the case even 

for systems that do not represent right-associativity). 

What comes closest to SFTP in the literature is the notion of a modular
theorem prover for substructural logics, in which a basic proof procedure is presented
for all substructural logics, differing only in a parameterized element. In jHj such 
parameter is the closure rule for analytic tableaux. In P, this parameter is the 
licensing of the application of a Natural Deduction rule. 

6 Conclusions 

Computing an efficient solution for complex structural constraints shows that
structurally free theorem provers can be constructed for the •-fragment of
substructural logics. The next step should be to extend the logic SFL to
incorporate complex combinators and then investigate larger fragments.

Steps in that direction were given in m, where the {•, /}-fragment is inves- 
tigated; it was shown that all intuitionistic deductions could be represented by 
SFL with the combinators B, C, I, W and K (W^{-1} can always be replaced
by K). Note that if we can treat the {•, /, \}-fragment, we will be dealing with
the basic connectives of the Lambek Calculus, a logic with wide applications
in Computational Linguistics. SFTP may then become useful for the automated
learning of structural rules involved in language parsing.

Another potential area of application for SFL and SFTP is bridging the
logical gap between two approaches to Categorial Grammar. On the one
hand, there is the deductive approach, recently summarized in m. On the
other hand, there is Steedman's Combinatory Categorial Grammar, in which
certain operations, related to the application of combinators, are introduced in
the deductive system. We believe that the presence of combinators as first-class
elements in SFL may be of use here.

^ Of course, this approach will not be able to resolve the methodological differences
between the two approaches. For instance, the way Steedman's CCG treats the
extraction phenomenon is very different from how Moortgat's CG does it.

Acknowledgments. The author would like to thank Katalin Bimbo for much 
appreciated discussions and invaluable comments on earlier versions of this pa- 
per. 
Abstract. The paper investigates the connection between non-constituent
coordination, as implemented in categorial grammar by means of a
polymorphic type-assignment to lexical conjunctions, and hypothetical
reasoning in Categorial Logics. A way of extending the logic is suggested,
so that coordination can be applied to types depending on undischarged
assumptions. By a certain "resource manipulation" of assumptions (of
hypothetical reasoning), a late discharge is facilitated, leading to what is
referred to as basic non-constituent coordination, whereby only basic
types (and not functional types of any kind) are coordinated. The
approach also provides a syntactic counterpart to the usual definition of
generalized meet for higher-order functions into a boolean range,
stipulated in the literature as the meaning of coordination.



1 Introduction 

An acclaimed advantage of Categorial Grammar (CG) as a theory of the syntax
of natural language(s) is its smooth syntactic analysis of non-constituent
coordination (NCC); via the Curry-Howard correspondence, the semantics falls out
naturally too. Thus, coordinations like

(1) Mary kissed John and hugged Bill 

(2) Mary kissed and Sue hugged John 

are treated on a par. This is done by assigning the universally
quantified type ∀X : ((X → X) ← X) (abbreviated below to κ) to lexical
conjunctions. This type is used in conjunction with the following ncc-rule (which
covers constituent-coordination as a special case):

$$\frac{\tau \qquad \kappa \qquad \tau}{\tau}\;(ncc)$$



This rule abbreviates an instantiation of the bound type-variable X to τ, and
two actual arrow-elimination steps (see below). We use the more self-explanatory
arrow-notation for categories, instead of the original slash-notation. To allow the
whole range of NCC-derivations (see 0 for the extent of NCC), the ncc-rule
is accompanied by (forward and backward) composition-rules and type-raising
rules, leading to Combinatory Categorial Grammar (CCG) [15]. As a consequence
of the universal quantification on types in the κ type (even if the
instantiation of X is restricted only to certain linguistically motivated types), a
proliferation of conjoinable types emerges. An alternative route to the
composition and type-raising rules is obtained by passing to more expressive formalisms,
known as Type (Categorial) Logics, such as the various variants of the Lambek
calculus L, which have rules for hypothetical reasoning. In such systems, the
composition and type-raising rules become^1 derived proof-rules. For a recent survey,
see jEj.

What is the relationship between ncc-derivations and hypothetical reasoning?
There seems to be no explicit account of it in the literature. In EIDI there is an
implicit consideration of this combination. The only example (on p. 114, repeated
in Section 2) has the following characteristic: when the ncc-rule is applied, both
coordinated types have had all their assumptions discharged; arrow-introduction rules
are applied before coordination via ncc.

Similarly, all the examples in (Chapter Three, Section 3) have the same
characteristic, and the same goes for 0. Thus, even though the interaction of
the ncc-rule with hypothetical reasoning is not explicitly stated to satisfy the
above characteristic, it seems that this is the intended combination of an ncc-rule
with L. Note that "blind application" of the ncc-rule in L to types depending
on undischarged assumptions is indeed not clear. Assuming only peripheral
abstraction, after conjoining two types (each of which peripherally depending, say,
on one assumption), one of the two assumptions ceases to be peripheral and
becomes nondischargeable for the rest of the derivation. Intuitively, the reason
is that the two gaps that serve as assumptions should be filled by the same filler
for the application of the ncc-rule to yield the right result.

When semantics is considered too, the notation of the ncc-rule is somewhat
misleading; though all occurrences of τ in the ncc-rule have the same category
type as value (the type τ to which X is instantiated in κ), these occurrences
may have different interpretations as the semantic counterparts of their value.
Thus, in the "colon notation", one would write more fully

$$\frac{\tau : \alpha \qquad \kappa : \lambda f.\lambda g.\, f \sqcap g \qquad \tau : \beta}{\tau : \alpha \sqcap \beta}\;(ncc)$$

under the usual assumption that all semantic domains are closed w.r.t. '⊓'.
Indeed, in |5j '⊓' is not further analyzed. However, [El defines conjoining functions,
following m and [7], according to which

$$(\lambda x_1.f(x_1) \sqcap \lambda x_2.g(x_2)) =_{df} \lambda x.(f(x) \sqcap g(x))$$

(and recursively for Curried functions of higher type). When used for NCC,
while syntactically functor-categories are conjoined, semantically only
result-category interpretations are conjoined. This mismatch suggests some syntactic
operation under which the semantic definition will fall out by (a suitable
extension of) the Curry-Howard correspondence. We mention here that the semantics
of [15], as well as our revision presented below, deal only with wide scope of
quantifiers (under their usual generalized-quantifier interpretation) w.r.t.
coordination (an example is presented later). For full scope control, an orthogonal
device of scope modalities is used; see e.g. [2].

^1 Though not the third CCG rule, known as functional substitution, nor the "crossed
composition" variants of composition.
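The pointwise definition of conjoining functions quoted above, including its recursive extension to Curried functions, can be illustrated with a small sketch. Everything here is an assumption made for illustration: truth values stand in for the boolean range, the name `meet` is ours, and the toy relations `KISS` and `HUG` are invented data rather than anything taken from the cited works.

```python
def meet(f, g):
    """Generalized meet: truth values are conjoined directly, and functions
    are conjoined pointwise, recursively through Curried arguments."""
    if callable(f) and callable(g):
        return lambda x: meet(f(x), g(x))
    return f and g

# Invented toy model: pairs (subject, object) for which the relation holds.
KISS = {("mary", "john")}
HUG = {("mary", "john"), ("sue", "john")}

# Curried transitive-verb meanings taking the object first, mirroring the
# category (np -> s) <- np, which consumes its object before its subject.
kissed = lambda obj: lambda subj: (subj, obj) in KISS
hugged = lambda obj: lambda subj: (subj, obj) in HUG

kissed_and_hugged = meet(kissed, hugged)
mary_result = kissed_and_hugged("john")("mary")   # both relations hold
sue_result = kissed_and_hugged("john")("sue")     # only HUG holds
```

The point of the sketch is the mismatch noted in the text: what is conjoined syntactically is a functor, but the conjunction itself only ever happens at the result type, once all arguments have been supplied.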



In the present paper, we have two main purposes. On the more technical
level, the paper provides an explicit treatment of the combination of the
ncc-rule with L. In particular, it provides a way of applying the ncc-rule prior to the
arrow-introduction rules, under a certain side condition manipulating
assumptions (thereby also reducing overgeneration), and only afterwards discharging the
(manipulated) assumption(s). Admittedly, the proposed rule transcends the
traditional framework of type-logical grammar, where only introduction and
elimination rules for the various operators are used, in addition to structural rules.
This point is discussed further after the exposition of the proposed system.



As for the application of L as a formalism for grammar formulation for NL,
the proposal here provides another view of NCC. On the categorial level, the
coordinated categories are always constituents. On the phrasal level, the
coordinated phrases are what we call pseudo-constituents: constituents with gaps. This
way, non-constituent coordination is perceived as a filler-gap structure. The key
observation is that several gaps may be associated with the same filler. This
is an elaboration of the observation in 0, whereby NCC-sentences are viewed
as having shared substructures. The shared substructures (with their gaps) are
pseudo-constituents, and the intended effect of the application of a revised
ncc-rule is to identify the two gaps into one gap, by identifying the assumptions (for
hypothetical reasoning) introduced by these gaps. Upon application of an
arrow-introduction rule, the conjoined pseudo-constituents have one antecedent, while
discharging the assumption resulting from the identification; this represents a
simultaneous discharge of the original assumptions. Finally, by using a filler for
the gap via an arrow-elimination rule, both original gaps can be perceived as
having been filled by the same filler. Below, we present the basic non-constituent
coordination thesis, which in a way eliminates NCC altogether as being different
in essence from constituent-coordination. It enjoys the additional advantage of
restricting coordination to basic categories only, thereby avoiding the
proliferation of types over which universal quantification takes place in κ. It also blocks
some of the well-known NCCs overgenerated by the traditional approach.
2 Preliminaries



We briefly review the basic notions and the notation used. The (associative,
product-free) Lambek calculus L |H| is a calculus for deriving valid declarative
units of the form U ▷ A, where U is a sequence of categories, and A is a category.
The set C of types (categories) is the closure of a finite set B of basic types under
'→' and '←'. The meaning of a declarative unit is that the sequence U reduces to
A. Validity is defined in terms of denotations over a string model (see m). A
natural deduction version of the calculus for L is presented in Figure 1. Assumptions



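$$(ax)\;\; A \rhd A$$

$$(\to E)\;\; \frac{U_1 \rhd B \qquad U_2 \rhd (B \to A)}{(U_1 U_2) \rhd A} \qquad\qquad (\to I)\;\; \frac{(B\,U) \rhd A}{U \rhd (B \to A)}$$

$$(\leftarrow E)\;\; \frac{U_2 \rhd (A \leftarrow B) \qquad U_1 \rhd B}{(U_2 U_1) \rhd A} \qquad\qquad (\leftarrow I)\;\; \frac{(U\,B) \rhd A}{U \rhd (A \leftarrow B)}$$

Fig. 1. The L-calculus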



are enclosed in square brackets and indexed for reference when discharged by
arrow-introduction rules. The ncc-rule for L is (ncc).
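The two arrow-elimination rules of Figure 1 can be made concrete with a small sketch. The encoding of categories as tagged tuples and the function name `eliminate` are assumptions made here for illustration; only the behaviour of the rules (B → A seeks a B on its left, A ← B seeks a B on its right) is taken from the text.

```python
from typing import Optional, Tuple, Union

# A category is a basic type (str) or a triple (arrow, x, y):
#   ("->", B, A) stands for B -> A  (seeks a B on its left, yields A)
#   ("<-", A, B) stands for A <- B  (seeks a B on its right, yields A)
Cat = Union[str, Tuple[str, "Cat", "Cat"]]

def eliminate(left: Cat, right: Cat) -> Optional[Cat]:
    """Combine two adjacent categories by arrow elimination.
    Returns the resulting category, or None if neither rule applies."""
    if isinstance(right, tuple) and right[0] == "->" and right[1] == left:
        return right[2]          # left : B,  right : B -> A   gives A
    if isinstance(left, tuple) and left[0] == "<-" and left[2] == right:
        return left[1]           # left : A <- B,  right : B   gives A
    return None

NP, S = "np", "s"
TV = ("<-", ("->", NP, S), NP)   # transitive verb: (np -> s) <- np

vp = eliminate(TV, NP)           # "kissed John"
sentence = eliminate(NP, vp)     # "Mary" followed by "kissed John"
```

The sketch covers elimination only; the arrow-introduction rules, which discharge assumptions, are exactly what the Ajdukiewicz-fragment discussed next lacks.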

Traditional CG is equivalent to the Ajdukiewicz-fragment A, which does not
have the hypothetical reasoning rules of arrow-introduction (← I) and (→ I).
Combinatory Categorial Grammar (CCG) is obtained by augmenting A with
additional specific rules: type-raising, composition, and substitution. Figure 2
presents two of these rules, the (forward and backward) type-raising rules and
composition-rules. The substitution rule is not needed for the current discussion.



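$$(\leftarrow t)\;\; \frac{U \rhd A}{U \rhd B \leftarrow (A \to B)} \qquad\qquad (\leftarrow c)\;\; \frac{U_1 \rhd A \leftarrow B \qquad U_2 \rhd B \leftarrow C}{(U_1 U_2) \rhd A \leftarrow C}$$

$$(\to t)\;\; \frac{U \rhd A}{U \rhd (B \leftarrow A) \to B} \qquad\qquad (\to c)\;\; \frac{U_1 \rhd A \to B \qquad U_2 \rhd B \to C}{(U_1 U_2) \rhd A \to C}$$

Fig. 2. Additional CCG rules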



For a detailed exposition of CCG see The above rules are validity pre- 
serving and can be derived in £, though not in A. 
We present below two typical derivations using the ncc-rule in L. By
mimicking the derivation in m (p. 114), we obtain a derivation of left-NCC, shown
in Figure 3. Note that all hypothetical reasoning and assumption discharge in
the derivation in Figure 3 precedes the application of the ncc-rule. In the latter,
X in κ is instantiated to (s ← np). An L-derivation of a right-NCC



[Derivation tree; lexical assignments: Mary : np, kissed : (np → s) ← np,
Sue : np, hugged : (np → s) ← np, and : κ, John : np; hypothetical
assumptions [np]1 and [np]2.]

Fig. 3. A left-ncc derivation in L



(3) Mary gave a book to John and a record to Bill. 

is presented in Figure 4. Note the instantiation of X in κ to the complex type
((vp ← pp) ← np) → vp.

[Derivation tree; lexical assignments: Mary : np, gave : (vp ← pp) ← np,
a book : np, to John : pp, a record : np, to Bill : pp, and : κ; hypothetical
assumptions [(vp ← pp) ← np]1 and [(vp ← pp) ← np]2.]

Fig. 4. A right-ncc derivation in L

In Figure 5 we present an example of an incorrect L-derivation, arising
from applying the ncc-rule "blindly" to types depending
on assumptions. The second, non-peripheral, abstraction is of unclear nature
and is marked with Note that the assumption [np]1 becomes
peripherally-nondischargeable after the alleged application of the ncc-rule.



[Derivation tree, marked (*); lexical assignments: Mary : np,
kissed : (np → s) ← np, Sue : np, hugged : (np → s) ← np, and : κ,
John : np; hypothetical assumptions [np]1 and [np]2.]

Fig. 5. Incorrect left-ncc L-derivation with ncc-rule?



3 Combining Hypothetical Reasoning with a Revised 
Ncc-Rule 

We now provide an explicit account of the combination of hypothetical
reasoning with a revised form of the ncc-rule, which allows the combination of types
depending on undischarged assumptions. The main idea is that, in order for the
revised ncc-rule to apply, the two copies of X must not only be instantiated in κ to
the same type, but must also be alignable in terms of the (undischarged)
assumptions on which these two Xs depend^2 (if any): in an L-derivation, two instances
of a category X are alignable iff they depend on assumptions of the same type
and of equi-peripherality. Let us restrict the discussion for a while to
pseudo-constituents with (at most) a single, peripheral gap. Alignment is possible in
any of the following cases:



1. Neither instance of X depends on any assumptions,

2. Both instances of X depend on left-peripheral assumptions (of the same
type),

3. Both instances of X depend on right-peripheral assumptions (of the same
type).
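The three cases above amount to a simple equality test on what each conjunct depends on. The sketch below is only an illustration under the single-gap restriction; the `Conjunct` record, its field names, and the function `alignable` are hypothetical and are not part of the calculus itself.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Conjunct:
    category: str                    # the category being coordinated, e.g. "s"
    gap_type: Optional[str] = None   # type of the single undischarged assumption
    gap_side: Optional[str] = None   # "left" or "right" peripheral, if any

def alignable(a: Conjunct, b: Conjunct) -> bool:
    """Single-gap alignability: same category, and either no gap on both
    sides or gaps of the same type and the same peripherality."""
    if a.category != b.category:
        return False
    return (a.gap_type, a.gap_side) == (b.gap_type, b.gap_side)

# "Mary kissed -" and "Sue hugged -": right-peripheral np gaps.
mary_kissed = Conjunct("s", "np", "right")
sue_hugged = Conjunct("s", "np", "right")

# "- slept": left-peripheral np gap; "Mary kissed John": no gap.
slept = Conjunct("s", "np", "left")
mary_kissed_john = Conjunct("s")

ok = alignable(mary_kissed, sue_hugged)           # case 3 applies
mixed_sides = alignable(slept, mary_kissed)       # left gap vs right gap
gap_vs_none = alignable(slept, mary_kissed_john)  # gap vs no gap
```

In this encoding only `ok` holds; the other two pairs fail the test, which is the behaviour the alignability condition is meant to enforce.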



The revised ncc-rule operates only on alignable types, by identifying the
(equiperipheral) assumptions on which they depend, and replacing them by a
new assumption of the same type and bearing the same peripherality to the
resulting declarative unit. An important effect of this definition is that any
continuation of derivation that was possible for each of the coordinated declarative

^2 In this context, whenever dependence on assumptions is mentioned, it is understood
as dependence on undischarged assumptions.



Hypothetical Reasoning and Basic Non-constituent Coordination 






units (representing pseudo-constituent phrases) is possible for the declarative
unit resulting from the revised ncc-rule (representing the coordinated
pseudo-constituent). In particular, any discharge of an original assumption (within the
given continuation) is now replaced by a discharge of the assumption that results
from identifying the original assumptions, when an arrow-introduction rule is
applied. In the general case of dependence on more than one assumption
(presented below), alignment of assumptions has the same effect, but for more
general dependency on assumptions.



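$$(ncc)\;\; \frac{U_1 \rhd \tau \qquad U_2 \rhd \kappa \qquad U_3 \rhd \tau}{(U_1 U_2 U_3) \rhd \tau} \qquad \tau \text{ assumption-independent}$$

$$\frac{U_1 [A]_1 \rhd \tau \qquad U_2 \rhd \kappa \qquad U_3 [A]_2 \rhd \tau}{(U_1 U_2 U_3 [A]_{1=2}) \rhd \tau}$$

$$\frac{[A]_1 U_1 \rhd \tau \qquad U_2 \rhd \kappa \qquad [A]_2 U_3 \rhd \tau}{([A]_{1=2} U_1 U_2 U_3) \rhd \tau}$$

Fig. 6. The revised ncc-rules for L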



The simplified revised ncc-rule (under the single-gap assumption) now consists
of three separate rules, presented in Figure 6. The notational convention
adopted is that assumptions indexed by a single index are present in the original
resources, while those indexed by an expression containing '=' are generated
in the proof itself. An assumption [τ]i=j is the result of identifying the two
assumptions [τ]i and [τ]j. We adhere here to this simplified notation because
we do not consider examples with iterated application of the revised rule. In
a more general setup, where such iterations are considered, a better notation
would be identifying [τ]I and [τ]J, for I, J some finite index sets, to produce
[τ]I∪J; then, [τ]i=j would be merely [τ]{i,j}. Here we stick with the simpler
notation. As already mentioned, the proposed rule transcends the usual
type-logical framework. In a more general framework, it might be sensible to detach
the assumption-identification from uses by the revised ncc-rule, and give it an
autonomous status, associated with the semantic operation of substitution for
(free) variables in λ-terms, as shown below. I am currently working on such a
more general framework. At this point, however, I have no worked-out additional
applications of this operation except within coordination, so it is left as part of
the revised coordination rule.

There is no semantic reason for disallowing coordination, say, of np → s
and s ← np, both having predicates of the form λx[p(x)] as their associated
meanings. However, permitting the coordination of non-alignable types would
allow the derivation of

(4) (*) A man who - slept and Mary kissed - ,

which is syntactically bad^3. Note also that the suggested rule is in accordance
with the general point of view of Lambek-calculi as resource-management
logics. Here, assumptions are recognized as special resources, and are given their
own resource-manipulation rules. This resource-manipulation might be seen as a
controlled modulation (as alluded to in cni) regarding a limited amount of the
contraction structural-rule, absent in its full generality from L, in the context of
NCC.



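[Derivation tree; lexical assignments: Mary : np, kissed : (np → s) ← np,
Sue : np, hugged : (np → s) ← np, and : κ, John : np; assumptions [np]1 and
[np]2 are identified as [np]1=2 by lncc, which is then discharged by ← I1=2.]

Fig. 7. Left-ncc derivation with revised ncc-rule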



Let us inspect some examples of the application of the revised ncc-rule. First,
note that derivation of constituent-coordination remains valid, as
constituent-types are always alignable, not depending on assumptions. Figure 7 presents the
derivation for (2). The lncc-rule discharges the two right-peripheral assumptions
[np]i, i = 1, 2, and the resulting s depends on the newly-generated assumption
(right-peripheral to the whole NCC) [np]1=2; the latter is then discharged by
the arrow-introduction rule. Note that while the conjoined phrases are indeed
pseudo-constituents (having an np-gap each in the object position), the
categories conjoined are s (though still depending on assumptions).

A regularity observed in the above example is that an application of the
revised ncc-rule is always immediately followed by an application of an
arrow-introduction rule, which discharges the newly introduced assumption. This
suggests a somewhat different formulation of the coordination rule, by which the
arrow-introduction is absorbed into the coordination rule. Such a formulation
would yield, for the right NCC,

$$\frac{[A]_1 U_1 \rhd \tau \qquad U_2 \rhd \kappa \qquad [A]_2 U_3 \rhd \tau}{(U_1 U_2 U_3) \rhd ([A]_{1=2} \to \tau)}$$

However, anticipating future applications of the various "components" of the
coordination rule, we find it preferable not to load too much on this rule, and to keep
the whole treatment more modular and separable in future. Thus, we stick here
with the formulation proposed before. Furthermore, as can also be seen from the
example derivations, proof-generated assumptions need to be exempt from the
"Prawitz normal-form" requirement m, and functor-categories generated by
discharging a combined-assumption generated by the revised ncc-rule can be
immediately applied to arguments, without the danger of spurious ambiguity.

^3 There are known exceptions, e.g. who(m) did Mary give - a present and kiss -.

Turning to the general case of alignable categories, with any number of
equal-type, equal-peripherality assumptions, we refer to the revised ncc-rule as ncch
(ncc with hypothetical types). The rule is schematic over the numbers of
assumptions. It is presented in Figure 8. All superscripts are for distinguishing
types; all subscripts, assumed pairwise different, are for numbering assumptions.
Note that the special-case rules presented before are obtained, respectively, for the
three cases n = m = 0, n = 0 ∧ m = 1 and n = 1 ∧ m = 0. Note also how the
rule identifies corresponding pairs of equi-typed, equi-peripheral assumptions, to
generate for each pair a combined assumption with the same type and
peripherality. This rule is not an iteration of the basic revised ncc-rule, as there is only
one instance of κ involved in its application. An iterative application of the basic
rule would need as many instances of κ as there are iterations, as usual
in resource-conscious logics.

Fig. 8. The revised ncc-rule for general alignable types

The revised derivation for a right-ncc is shown in Figure 9. The derivations
henceforth are displayed as a collection of sub-derivations (separated by dotted
lines) for ease of formatting.

To see further the effect of the revised ncc-rule, consider a derivation of a
sentence containing "split constituent" ncc (in the nomenclature of |3):

(5) Mary drove [to Chicago] yesterday and Detroit today.

The square brackets delimit the constituent, while the underline indicates the
coordinated phrases. The derivation is presented in Figure 10. We use the derived
rule for shortening the derivation. As a more complicated example, in which
the revised ncc-rule is applied twice, consider the derivation in Figure 11 of a
two-sided NCC in

(6) Mary gave and Sue sold a book to John and a record to Bill.

An alternative implementation of the idea of identifying assumptions, leading
to a late discharge within a pure natural-deduction framework, can be obtained
by using a multi-modal logic with a "bracketing modality" (see UUj, Ch. 4,
and the references there), which would introduce another composition operation
that distributes over products within a bracketed context. This would also relax
locally the absence of a contracting structural rule. We do not pursue this
possibility further here.
[Derivation tree; legible entries: a record : np, the type
((vp ← pp) ← np) → vp, and a final vp.]

Fig. 9. Right-ncc derivation with revised ncc-rule



Our scheme blocks the derivations in El, of sentences like

(7) (*) A man who - slept and Mary kissed John.

Any attempt to derive this sentence by applying the revised ncc-rule will have
to conjoin the phrase - slept, of category s depending on a left-peripheral
assumption, and the phrase Mary kissed John, also of category s but not depending
on any assumptions. Thus, the alignability condition is violated. At this point,
there is no characterization of exactly which over-generations are blocked.
However, in view of the interesting non-blocking of the following bad sentence^4,
derivable also by the original ncc-rule, it is clear that the current rule is still
a coarse approximation to the ultimate coordination rule within type-logical
grammar.

(8) (*) The mother of and Bill thought John arrived. 

The derivation is shown in Figure 12. Clearly, alignability is not strong enough to
block such examples, which depend on how assumptions are used, not only where
they are located. Thus, a finer equivalence relation among assumptions is needed
to block more over-generated phrases; once a better approximation is obtained,
an exact characterization should be sought for any remaining non-blockings.

^4 Attributed in 0 to Paul Dekker.

Fig. 10. Derivation of "split constituent" right-ncc with revised ncc-rule



3.1 The Semantics of Coordination 

Turning to the semantics of the revised ncc-rule (under the assumption of one
peripheral gap only), suppose the meanings of the two conjoined types are α(x1)
and β(x2), where x1, x2 are the meanings of the undischarged assumptions the
respective coordinated types depend upon, say [...]1 and [...]2. Then, the meaning
of the generated assumption [...]1=2 is x, where x is a fresh variable, and the
meaning of the resulting conjoined type is (α(x1) ⊓ β(x2))[x1 := x, x2 := x],
amounting to α(x) ⊓ β(x). Here α[x := y] means the substitution of y for all free
occurrences of x in α. This semantic rule captures the intention of identifying
the two assumptions semantically. After applying an arrow-introduction rule,
say ← I1=2, the obtained meaning of the result becomes λx.(α(x) ⊓ β(x)), the
exact meaning stipulated by Steedman as holding by definition. The substitution
of a fresh variable for all occurrences of two different free variables can be seen as
the extension of the Curry-Howard correspondence to the syntactic operation
manifesting itself in the revised ncc-rule.
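As a worked instance of the semantic rule just stated, consider sentence (2), Mary kissed and Sue hugged John. The constants mary, sue, john and the predicates kiss and hug are assumed lexical meanings introduced here for illustration, with the subject as first argument. Taking α(x1) = kiss(mary, x1) and β(x2) = hug(sue, x2), the steps are:

```latex
\begin{align*}
\alpha(x_1) \sqcap \beta(x_2)
  &= kiss(mary, x_1) \sqcap hug(sue, x_2)
  && \text{meanings of the two conjuncts} \\
(\alpha(x_1) \sqcap \beta(x_2))[x_1 := x,\ x_2 := x]
  &= kiss(mary, x) \sqcap hug(sue, x)
  && \text{identification of the assumptions} \\
\lambda x.(kiss(mary, x) \sqcap hug(sue, x))
  &
  && \text{after } {\leftarrow} I_{1=2} \\
(\lambda x.(kiss(mary, x) \sqcap hug(sue, x)))(john)
  &= kiss(mary, john) \sqcap hug(sue, john)
  && \text{after } {\leftarrow} E \text{ with the filler}
\end{align*}
```

The single fresh variable x is what guarantees that both gaps receive the same filler in the last step.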

Fig. 11. Derivation of two-sided ncc with revised ncc-rule

Fig. 12. Derivation of Dekker's counterexample with revised ncc-rule

To see the relative scope of quantification and conjunction, consider the
variant (9) of (2).

(9) Mary kissed and Sue hugged a boy

Interpreting a boy as the usual generalized quantifier λP[∃x(boy(x) ∧ P(x))], and
resorting to a derivation similar to that in Figure 7 (but with an additional
type-raising of the object np), we get the meaning of the coordinated phrase as
follows:

∃x(boy(x) ∧ kiss(Mary, x) ∧ hug(sue, x))

Thus, the existential quantifier takes scope over conjunction. Similarly, for the
Geach sentence dm

(10) Every girl loves, and every boy detests, some saxophonist

we get the reading in which the existential quantifier takes scope over the two
universal quantifiers (as well as over the conjunction). The reading where the
existential quantifier has the lowest scope can also be obtained, as mentioned
before, by using scope modalities. However, the bad reading, in which two possibly
different saxophonists are involved, one related to the girls and the other related
to the boys, is not generable, due to gap identification. In m, Steedman
suggests abandoning the GQ interpretation of indefinites in order to block the bad
reading of (10), resorting to some referential interpretation. This is not needed
in the current approach to NCC.

We note while passing that a treatment of NCC using open formulae is 
hinted at in j^. There, the main argument is about the role of variables in 
the semantics, and the main issue is pronoun binding. In a brief consideration 
of an NCC example, Jacobson hints towards a conjunction of formulae with 






N. Francez 



free variables, but nothing relates it to hypothetical reasoning. The semantics of
conjunction is stipulated similarly to Steedman’s. 



[Derivation tree; lexical entries: gave, a book : np, to John : pp,
a record : np, to Bill : pp, and; the two pairs of left-peripheral
assumptions are identified by ncch as [...]1=2 and [(vp ← pp) ← np]3=4,
and the coordinated type is s.]

Fig. 13. An alternative right-ncc derivation with the ncch-rule



4 Basic Non-constituent Coordination 



Consider the derivation in Figure 13. It uses two left-peripheral assumptions,
hence ncch is indeed needed. Note that the coordinated type is s. This is not
accidental! By resorting to ncch, one can always introduce sufficiently many
assumptions, so that the pseudo-constituents coordinated are open sentences (with
a semantics containing free variables for all assumptions). There is one exception,
though, for np-coordination, having a non-distributive predication, like Mary and 
John met. For a recent account of boolean semantics for np-coordination (both 
collective and distributive) see ^2j. We may, therefore, conclude, that nothing 
is lost if the logic forbids ncc-coordination of functional categories, restricting it 
to base categories only (s and np in the current version of C employed here). 

Basic non-constituent coordination thesis: Only (and all) base- 
categories can be coordinated. 

There is no longer any distinction between constituent and non-constituent
coordination, and ncch applies equally to both, restoring the attractive similarity
between the derivations of (1) and (2).

Jacobson gives a short discussion of the role of variables in model-theoretic
semantics, drawing a distinction between a semantics in which “variables do
real work” and one in which they do not. The distinction is drawn, though,
based on two operations supported by variables: being bound (or abstracted)
and denoting a value via variable-assignments. We would like to emphasize that
in our extension of C with ncch, variables are meanings of resources, and the
operations on them correspond to resource manipulation allowed by the
substructural logic. Thus, the identification of different variables via substitution,
corresponding to assumption-identification by ncch, should render the current
semantics as belonging to the sort in which “variables do real work”. The real
work is obtained as an extension of the traditional Curry-Howard correspondence.



5 Conclusions 

In this paper, we presented a new approach to the treatment of non-constituent
coordination in type-logical grammar. The approach is based on a rule that deviates
from the standard rule landscape of introduction, elimination and structural
control rules. It is based on a view of assumptions used for hypothetical reasoning
as resources amenable to their own resource manipulation; in this case, coalescing
different assumptions, of equal type and peripherality, into one, in the context
of the coordination rule. This leads to an ability to coordinate types depending
on undischarged assumptions, contrary to current practice in the literature.
Ultimately, this leads to the so-called basic view of NCC, restricting the
polymorphic type in the coordination rule to range over basic categories only.

There is room for much further research, in which I am currently engaged. 



— There is a good reason to dissociate assumption identification from
the coordination rule. As this is a syntactic counterpart of substitution in
λ-terms, extending the Curry-Howard correspondence, it may have additional
applications.

— There is a need for refining the alignability condition used in the current
proposal, thereby obtaining a finer equivalence among assumptions for
hypothetical reasoning. This would improve the blocking of over-generated
phrases.

One direction to look at is a notion of parallelism between
coordinated phrases that has been proposed, based on having the same structure of
constituency dependency. The latter reflects the way assumptions used as a
major premiss in applications of arrow-elimination rules are discharged.

— Emms shows that having general polymorphism for ∀-types (though
with empty antecedents in declarative units) leads to undecidability. It
should be interesting to check whether confining polymorphism to basic types
only affects decidability.

— Additional instances of NCC should be investigated. One such topic is NCC
using subject-control verbs. Thus, we would like to be able to generate a
proper interpretation of (11), while we would like to block (12).

(11) John urged Bill and persuaded Mary to go 

(12) (*) John persuaded and promised Mary to go 

Abstract. One may identify two main approaches to the description
of type hierarchies. In total specification, a unique hierarchy is described.
In open specification, a set of constraints identifies properties of the
hierarchy, without providing a complete description. Open specification
provides increased expressive power, but at the expense of considerable
computational complexity, with essential tasks being NP-complete
or NP-hard. In this work, a formal study of the structural and
computational aspects of open specification is conducted, so that a better
understanding may be gained of how techniques may be developed to address these
complexities. In addition, a technique is presented, based upon Horn clauses,
which allows one to obtain answers to certain types of queries on open
specifications very efficiently.



1 Introduction 

In recent years, grammatical formalisms based upon constraint-based parsing
within the context of typed feature structures have become central within
computational linguistics, the most prominent undoubtedly being HPSG. As a
result, numerous computational frameworks specifically designed for
constraint-based reasoning on typed feature logics have emerged, among them ALE,
TFS, and CUF [7]. Likewise, systems for representing and managing lexical
information in a hierarchical fashion have appeared in recent years. To
function efficiently, all of these frameworks must first and foremost be capable of
managing the associated type hierarchy effectively.¹

¹ By “type hierarchy,” we mean the static hierarchy of (parameterless) types which
underlies the typing mechanism of the feature structures. We do not include the
recursively specified types which also form an integral part of grammars which are
specified using such systems. In CUF, the static types are called types, and the
(often parameterized) recursive types are called sorts, but there is no universal
agreement, and other terminology, or even the opposite terminology, is often used.
Pollard and Sag, for example, use the term sort to characterize that which we
call type.

All type hierarchies share some basic properties, such as the ability to
express inheritance. However, beyond that, there is relatively little agreement as
to which properties are appropriate. It seems clear that these differences arise not
because some systems embody the “correct” formalism while others do not, but
rather because each system places distinct expectations upon the hierarchy. As
with all forms of knowledge representation, there is a fundamental tradeoff
between expressiveness and tractability in the formalization of type hierarchies. In
this paper, we examine aspects of the tradeoff which occurs when one moves
from total specification, in which the entire hierarchy is specified explicitly, to
constraint-based open specification,² in which the actual hierarchy is not
completely specified, but rather may be any of the models of a set of constraints.

It is fair to say that total specification is used far more frequently than is
open specification. Indeed, the only system which we know of which supports
open specification is CUF.³ It appears to be the case, at least in part, that
open specification has been avoided because the computational overhead is
perceived to be too high. This perception is reinforced by the fact that the
question of model existence in such a context is NP-complete. Nonetheless,
we feel that open specification can be a useful tool in some contexts, and it is
therefore important to understand more about its representational aspects and
computational complexity. In this paper, we take some steps towards this end.

The paper is organized as follows. In Sec. 2, a brief summary of the
representational aspects of completely specified hierarchies is presented. Such a summary
is important because even within that context, there is a critical distinction
between distributive and nondistributive hierarchies which must be understood to
appreciate fully the various aspects of open specification, since the latter is
carried out in a distributive context. In Sec. 3, the basic ideas and structural aspects
of open specification are presented. In Sec. 4, representations which are essential
for the process of computing solutions to an open specification are developed.
Finally, in Sec. 5, some efficient techniques for identifying properties of models
in open specifications, based upon Horn clauses, are developed.

This paper may be viewed as a complement to P). In that work, the focus 
was largely upon developing the machinery necessary to show the satisfaction 
problem to be NP-complete. In this paper, we focus more on structural and 
algorithmic issues. 



² In [21, Sec. 3.1], the term open specification is used in a different way, in a discussion
contrasting ALE to Troll. We see this terminological overloading as a non-issue,
provided authors are careful to indicate which definitions are in force in their
writings.

³ In his work, Aït-Kaci has used a crown construction to complete hierarchies
which may not have all glb’s. However, this technique always associates a unique
completion with a specification. Thus, we regard it as a means of extracting a total
specification from an incomplete specification, rather than one of open modelling.

2 The Role of Distributivity

This paper is largely about the open specification of distributive hierarchies.
However, to put that work in perspective, it is first necessary to identify the reasons
why distributivity is important. In this section, we present a brief overview of
the rationale for requiring distributivity in ordinary, complete hierarchies.



2.1 Bounded and Distributive Lattices 

A lattice is bounded if it has a greatest element $\top$ and a least element $\bot$. It
is always assumed that $\top$ and $\bot$ are distinct, so under this definition, a
lattice must have at least two different elements. As a notational convention,
a boldface symbol (e.g., $\mathbf{L}$) will denote the entire algebraic structure, while
the roman symbol $L$ will denote just the underlying set of elements. Thus,
we write $\mathbf{L} = (L, \vee, \wedge, \top, \bot)$. A lattice is distributive [21, p. 30] if it satisfies
$(a \vee b) \wedge c = (a \wedge c) \vee (b \wedge c)$ for all elements $a$, $b$, and $c$.



2.2 General Semantics for Total Type Hierarchies 

The syntactic component of a type hierarchy is just a finite bounded lattice
(not necessarily distributive). However, a type hierarchy has a semantics as well,
which identifies a collection of objects, together with a specification of which
objects belong to which types. More formally, let $\mathbf{L} = (L, \vee, \wedge, \top, \bot)$ be a finite
bounded lattice. A type semantics for $\mathbf{L}$ is a pair $\mathcal{S} = (\mathfrak{U}, \mathfrak{I})$, in which $\mathfrak{U}$ is a
nonempty set, called the universe of objects, and $\mathfrak{I} : L \to 2^{\mathfrak{U}}$ is a function which
associates a subset of $\mathfrak{U}$ to each type in $L$, subject to the conditions
that $\mathfrak{I}(\top) = \mathfrak{U}$, $\mathfrak{I}(\bot) = \emptyset$, and for $\tau_1, \tau_2 \in L$, $\tau_1 \le \tau_2$ implies $\mathfrak{I}(\tau_1) \subseteq \mathfrak{I}(\tau_2)$. The
last condition expresses the critical concept of inheritance: if $\tau_1 \le \tau_2$, then every
instance of $\tau_1$ is also an instance of $\tau_2$, and so every property which applies to
an object of type $\tau_2$ also applies to (i.e., is inherited by) every object of type $\tau_1$.

The semantics $(\mathfrak{U}, \mathfrak{I})$ separates $\{\tau_1, \tau_2\} \subseteq L$ if $\mathfrak{I}(\tau_1) \ne \mathfrak{I}(\tau_2)$, and is totally
separating if it separates every pair $\{\tau_1, \tau_2\} \subseteq L$. Note that $\{\top, \bot\}$ must be
separated by any semantics $(\mathfrak{U}, \mathfrak{I})$, since $\mathfrak{U}$ is required to be nonempty.

Formally, a type hierarchy is a pair $\mathfrak{H} = (\mathbf{L}, (\mathfrak{U}, \mathfrak{I}))$ in which $\mathbf{L}$ is a finite
bounded lattice and $(\mathfrak{U}, \mathfrak{I})$ is a semantics for $\mathbf{L}$.



2.3 Natural Meet Semantics 

The fundamental operation in constraint-based parsing strategies is unification;
the unification of two objects $x$ and $y$ results in an object $x \sqcup y$ which has all of
the attributes of both $x$ and $y$. In the context of typed unification, this means in
particular that $x \sqcup y$ is of both the type of $x$ and the type of $y$. Thus, if $x$ is known
to be of type $\tau_x$, and $y$ of type $\tau_y$, then $x \sqcup y$ must be of type $\tau_x \wedge \tau_y$. It follows
that a type hierarchy which is used for constraint-based parsing must have the
following property: A type hierarchy $\mathfrak{H} = (\mathbf{L}, (\mathfrak{U}, \mathfrak{I}))$ satisfies the natural meet
semantics if the following condition is satisfied for every $\tau_1, \tau_2 \in L$:

$\mathfrak{I}(\tau_1 \wedge \tau_2) = \mathfrak{I}(\tau_1) \cap \mathfrak{I}(\tau_2)$    (nat-$\wedge$)

A lattice $\mathbf{L}$ is said to admit natural meet semantics if there is a semantics $(\mathfrak{U}, \mathfrak{I})$
for $\mathbf{L}$ which satisfies condition (nat-$\wedge$) above.

2.4 Existence of Natural Meet Semantics 

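Every bounded lattice $\mathbf{L} = (L, \vee, \wedge, \top, \bot)$ admits a totally separating
natural meet semantics.

Proof. For each $\tau \in L \setminus \{\bot\}$, associate a distinct element $x_\tau$, and then put
$\mathfrak{U} = \{x_\tau \mid \tau \in L \setminus \{\bot\}\}$. Define $\mathfrak{I} : L \to 2^{\mathfrak{U}}$ by
$\tau \mapsto \{x_\sigma \mid \sigma \in L \setminus \{\bot\},\ \sigma \le \tau\}$. It is easy to see that
this semantics satisfies the condition (nat-$\wedge$), and it is totally separating by
construction. □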
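
The construction in the proof can be tried out on a small lattice. The following is a minimal sketch, assuming the lattice is given as a list of element names together with an order predicate; the names `glb`, `meet_semantics`, and the five-element diamond used as input are illustrative choices, not taken from the text. Each object $x_\sigma$ is represented simply by the name of $\sigma$.

```python
from itertools import product

def glb(elems, leq, x, y):
    """Greatest lower bound of x and y, computed from the order relation."""
    lower = [z for z in elems if leq(z, x) and leq(z, y)]
    # the meet is the lower bound that lies above every other lower bound
    return next(z for z in lower if all(leq(w, z) for w in lower))

def meet_semantics(elems, leq, bot):
    """Map each type t to {x_s | s != bot, s <= t}; x_s is represented by s."""
    return {t: frozenset(s for s in elems if s != bot and leq(s, t))
            for t in elems}

# Hypothetical example: the five-element diamond, which is not distributive.
elems = ["bot", "a", "b", "c", "top"]

def leq(x, y):
    return x == y or x == "bot" or y == "top"

J = meet_semantics(elems, leq, "bot")

# Condition (nat-meet): the semantics of a meet is the intersection.
nat_meet = all(J[glb(elems, leq, x, y)] == J[x] & J[y]
               for x, y in product(elems, repeat=2))
# Totally separating: distinct types receive distinct sets of objects.
separating = len(set(J.values())) == len(elems)

print(nat_meet, separating)  # True True
```

The example lattice is deliberately nondistributive, which illustrates that the natural meet semantics places no distributivity requirement on the hierarchy.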

2.5 The Role of Distributivity 

In the systems ALE, TFS, and the ACQUILEX LKB, among others,
the type hierarchies are not required to be distributive. Carpenter (pp. 15-17)
argues that type hierarchies should not be distributive. On the other hand,
the CUF system requires a distributive hierarchy. We shall now attempt to
sort out how these apparently disparate points of view can coexist.

Unification itself makes no use of the join operation; as outlined above, only
the meet operation is used. Even though a bounded finite meet semilattice will
necessarily have joins of any elements [Ch. 1, Sec. 3, Lem. 14], these joins
are not used in the unification process, and so it is unnecessary to assign any
computationally significant semantics to them. In short, only the natural meet
semantics is relevant. Since the concept of distributivity can make sense only in
a context in which there is a meaningful join as well as meet, distributivity plays
no formal role in these contexts. As shown in 2.4, it is always possible to assign
a complete meet semantics to a finite bounded lattice, regardless of any further
properties such as distributivity or modularity.

CUF, on the other hand, was designed to support disjunctive unification,
in which alternative parses are supported via a meaningful semantics on the join
operation. This indeed requires a distributive lattice.

2.6 Natural Join Semantics and Natural Semantics 

A type hierarchy $\mathfrak{H} = (\mathbf{L}, (\mathfrak{U}, \mathfrak{I}))$ satisfies the natural join semantics if the
following condition is satisfied for every $\tau_1, \tau_2 \in L$:

$\mathfrak{I}(\tau_1 \vee \tau_2) = \mathfrak{I}(\tau_1) \cup \mathfrak{I}(\tau_2)$    (nat-$\vee$)

The type hierarchy $\mathfrak{H}$ is said to satisfy the natural semantics if it satisfies both
conditions (nat-$\wedge$) and (nat-$\vee$). Similarly, a lattice $\mathbf{L}$ is said to admit natural
semantics if there is a semantics for $\mathbf{L}$ which satisfies both (nat-$\wedge$) and (nat-$\vee$).

The definitions of separating and totally separating natural semantics are
analogous to those which apply to meet semantics.

2.7 Birkhoff-Stone Representation of Distributive Lattices

A lattice $\mathbf{L}$ is called a ring of sets if there is a set $S$ (called the basis) with the
property that every $x \in L$ is a subset of $S$, and, furthermore, that for every
$x, y \in L$, $x \vee y = x \cup y$ and $x \wedge y = x \cap y$. In other words, join is union and meet
is intersection. In the case of a bounded lattice, it suffices to take $\top = S$. Such a
lattice has a “built-in” semantics; let $S$ be the universe of objects, with the set
of objects associated with an element $x \in L$ just $x$ itself.

The representation theorem of Birkhoff and Stone [Ch. 2, Sec. 1, Thm.
19] states that a lattice is distributive iff it is isomorphic to a ring of sets. It
is furthermore possible to require that the ring be nonredundant, in the precise
sense that if $s_1, s_2 \in S$ are distinct, then there is some $x \in L$ which contains one of $\{s_1, s_2\}$
but not both. (Otherwise, one of $\{s_1, s_2\}$ could be removed, with the resulting
lattice isomorphic to the original one.)



2.8 Existence and Algorithmic Aspects 

Let $\mathbf{L} = (L, \vee, \wedge, \top, \bot)$ be an arbitrary finite bounded lattice, and let $n =
\mathrm{Card}(L)$ denote the cardinality of $L$.

(a) $\mathbf{L}$ admits a totally separating natural semantics iff it is distributive. This
question is decidable in time $O(n^3)$.

(b) It is decidable in time polynomial in $n$ whether or not $\mathbf{L}$ admits a natural semantics.

(c) If $\mathbf{L}$ admits a natural semantics, it may be constructed in time polynomial in $n$.

Proof. The characterization is a consequence of the Birkhoff-Stone
representation theorem. The $O(n^3)$ algorithm arises from the fact that the distributive
law involves triples of elements; it is only necessary to check each such triple
in turn. The question of establishing a natural semantics involves construction
of an appropriate quotient lattice via the congruence defined by nondistributive
components. The details are not presented in this paper. The important point,
however, is that the construction may be carried out in deterministic polynomial
time. □
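
The triple-by-triple check mentioned in the proof can be sketched as follows. This is an illustration under stated assumptions: the lattice is supplied as a list of elements plus an order predicate, and the helper names (`lattice_tables`, `is_distributive`) and both test lattices are hypothetical. Once meet and join tables are available, the check visits each of the $n^3$ triples once; the naive table construction used here for brevity is not optimized and costs more than that.

```python
from itertools import product

def lattice_tables(elems, leq):
    """Derive meet and join tables from the order relation of a finite lattice."""
    meet, join = {}, {}
    for x, y in product(elems, repeat=2):
        lower = [z for z in elems if leq(z, x) and leq(z, y)]
        upper = [z for z in elems if leq(x, z) and leq(y, z)]
        # greatest lower bound: the lower bound above all other lower bounds
        meet[x, y] = next(z for z in lower if all(leq(w, z) for w in lower))
        # least upper bound: the upper bound below all other upper bounds
        join[x, y] = next(z for z in upper if all(leq(z, w) for w in upper))
    return meet, join

def is_distributive(elems, leq):
    """Check (a v b) ^ c == (a ^ c) v (b ^ c) for every triple of elements."""
    meet, join = lattice_tables(elems, leq)
    return all(meet[join[a, b], c] == join[meet[a, c], meet[b, c]]
               for a, b, c in product(elems, repeat=3))

# Hypothetical test lattices (not from the text).
m3 = ["bot", "a", "b", "c", "top"]  # the five-element diamond

def m3_leq(x, y):
    return x == y or x == "bot" or y == "top"

powerset = [frozenset(s) for s in ([], [1], [2], [1, 2])]  # a ring of sets

def subset_leq(x, y):
    return x <= y

print(is_distributive(m3, m3_leq))            # False
print(is_distributive(powerset, subset_leq))  # True
```

In the diamond, $(a \vee b) \wedge c = c$ while $(a \wedge c) \vee (b \wedge c) = \bot$, which is the failing triple the check finds.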



3 Open Specification of Type Hierarchies 

In open specification, a constraint-based specification replaces a complete
description of the hierarchy. At least one existing system, CUF [7], has taken this
approach. In this section, some basic issues surrounding such specifications are
examined, including conditions for the existence of a model, the size of models,
and the characterization of canonical models. Because the principal interest in such a
framework lies within the domain of distributive hierarchies, attention has been
restricted to that case.

3.1 Augmentation and Interpretations

A set of clean types is one which contains neither $\top$ nor $\bot$. For such a set $P$,
define $\mathrm{Aug}_\bot(P) = P \cup \{\bot\}$, $\mathrm{Aug}_\top(P) = P \cup \{\top\}$, and $\mathrm{Aug}(P) = P \cup \{\bot, \top\}$. An
interpretation of $P$ is a pair $I = (\mathbf{L}, f)$ in which $\mathbf{L} = (L, \vee, \wedge, \top, \bot)$ is a bounded
distributive lattice and $f : \mathrm{Aug}(P) \to L$ is a function which is subject to the
conditions that $f(\bot) = \bot$ and $f(\top) = \top$. The collection of all interpretations of
$P$ is denoted $\mathrm{Interp}(P)$.



3.2 Open Specifications 

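Let $P$ be any finite set of clean types. The system of constraints over $P$ is defined
to be the smallest set $\mathrm{Constraints}(P)$ satisfying the following conditions.

(c-$\le$): If $\tau_1 \in \mathrm{Aug}_\top(P)$, $\tau_2 \in \mathrm{Aug}_\bot(P)$, then $(\tau_1 \le \tau_2) \in \mathrm{Constraints}(P)$.

(c-$\wedge$): If $\tau \in \mathrm{Aug}(P)$ and $S \subseteq P$ is nonempty, then $(\bigwedge S = \tau) \in \mathrm{Constraints}(P)$.

(c-$\vee$): If $\tau \in \mathrm{Aug}(P)$ and $S \subseteq P$ is nonempty, then $(\bigvee S = \tau) \in \mathrm{Constraints}(P)$.

(c-$\ne$): If $\tau_1 \in P$, $\tau_2 \in \mathrm{Aug}(P)$, then $(\tau_1 \ne \tau_2) \in \mathrm{Constraints}(P)$.

(c-Atom): If $\tau \in P$, then $\mathrm{Atom}(\tau) \in \mathrm{Constraints}(P)$.

An interpretation $I = (\mathbf{L}, f)$ satisfies the constraint $\varphi \in \mathrm{Constraints}(P)$,
written $I \models \varphi$, if the appropriate rule given below is satisfied.

(sat-$\le$): $I \models (\tau_1 \le \tau_2)$ iff $f(\tau_1) \le f(\tau_2)$.

(sat-$\wedge$): $I \models (\bigwedge S = \tau)$ iff $\bigwedge \{f(\sigma) \mid \sigma \in S\} = f(\tau)$.

(sat-$\vee$): $I \models (\bigvee S = \tau)$ iff $\bigvee \{f(\sigma) \mid \sigma \in S\} = f(\tau)$.

(sat-$\ne$): $I \models (\tau_1 \ne \tau_2)$ iff $f(\tau_1) \ne f(\tau_2)$.

(sat-Atom): $I \models \mathrm{Atom}(\tau)$ iff $f(\tau)$ is an atom in $\mathbf{L}$ (i.e., $a \le f(\tau)$ implies $a = f(\tau)$
or $a = \bot$).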

An open specification is a pair $(P, \Phi)$, in which $P$ is a finite set of clean
types and $\Phi \subseteq \mathrm{Constraints}(P)$. A model of $(P, \Phi)$ is an interpretation $I = (\mathbf{L}, f :
\mathrm{Aug}(P) \to L)$ for which $I \models \varphi$ holds for each $\varphi \in \Phi$. A model $(\mathbf{L}, f)$ of $(P, \Phi)$ is
finite if $L$ is a finite set. The set of all models of $(P, \Phi)$ is denoted $\mathrm{Mod}(P, \Phi)$.
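
The satisfaction rules can be exercised on a concrete interpretation. The sketch below assumes the lattice is a ring of sets (so meet is intersection, join is union, and the order is set inclusion) and encodes constraints as tagged tuples; this encoding, the keys "TOP" and "BOT", and the function names are assumptions made for illustration only.

```python
def satisfies(lattice, f, constraint):
    """Check one constraint against an interpretation over a ring of sets.

    lattice: set of frozensets closed under union and intersection
    f: dict from type names (plus "TOP" and "BOT") to lattice elements
    """
    kind, *args = constraint
    bot = f["BOT"]
    if kind == "le":        # (t1 <= t2)
        t1, t2 = args
        return f[t1] <= f[t2]
    if kind == "meet":      # (meet of S = t), S nonempty
        S, t = args
        return frozenset.intersection(*(f[s] for s in S)) == f[t]
    if kind == "join":      # (join of S = t), S nonempty
        S, t = args
        return frozenset.union(*(f[s] for s in S)) == f[t]
    if kind == "neq":       # (t1 != t2)
        t1, t2 = args
        return f[t1] != f[t2]
    if kind == "atom":      # Atom(t): nothing strictly between bottom and f(t)
        (t,) = args
        return all(a == f[t] or a == bot for a in lattice if a <= f[t])
    raise ValueError(f"unknown constraint kind: {kind}")

def is_model(lattice, f, constraints):
    """An interpretation is a model if it satisfies every constraint."""
    return all(satisfies(lattice, f, c) for c in constraints)

# Hypothetical interpretation of P = {a, b} in the powerset of {1, 2}.
universe = frozenset({1, 2})
lattice = {frozenset(), frozenset({1}), frozenset({2}), universe}
f = {"a": frozenset({1}), "b": frozenset({2}),
     "TOP": universe, "BOT": frozenset()}
phi = [("join", ["a", "b"], "TOP"), ("meet", ["a", "b"], "BOT"),
       ("neq", "a", "b"), ("atom", "a"), ("atom", "b")]

print(is_model(lattice, f, phi))                # True
print(satisfies(lattice, f, ("le", "a", "b")))  # False
```

Restricting to rings of sets loses no generality for distributive lattices, by the Birkhoff-Stone representation of 2.7, but it is a choice of this sketch rather than part of the definition.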

It is convenient to partition the constraints into two classes. Elements of
the first three categories listed above ($\{\le, \wedge, \vee\}$) are called positive constraints,
while elements of the last two ($\{\ne, \mathrm{Atom}\}$) are called negative constraints. If
$\Phi \subseteq \mathrm{Constraints}(P)$, then $\Phi^+$ denotes the positive constraints of $\Phi$, and $\Phi^-$ the
negative constraints. Similarly, $\mathrm{Constraints}^+(P)$ (resp. $\mathrm{Constraints}^-(P)$) denotes
the set of positive (resp. negative) constraints contained in $\mathrm{Constraints}(P)$.

It is important to note that other forms of constraints may be realized easily,
even though they are not explicitly in this set. For example, a constraint of the
form $(\tau_1 = \tau_2)$ is equivalent to the pair of constraints $\{(\tau_1 \le \tau_2), (\tau_2 \le \tau_1)\}$.
Likewise, the constraint $(\tau_1 < \tau_2)$ may be viewed as an abbreviation for the set
$\{(\tau_1 \le \tau_2), (\tau_1 \ne \tau_2)\}$.

3.3 Canonical Models

In general, an open specification has a multitude of models. This situation arises
for two reasons. First, it is always possible to augment a model with all sorts
of extraneous types, without violating any constraints. For example, if $P_1 \subseteq P_2$,
and $\Phi$ is any set of constraints dealing only with symbols in $P_1$, then any model
of $(P_2, \Phi)$ defines a model of $(P_1, \Phi)$. Second, a model may impose a constraint
which is not mandated by the specification. For example, for the specification
$(\{a, b\}, \emptyset)$, a model in which $(a = b)$ holds is quite permissible, even though it is
not required to hold in all models.

The question thus arises as to whether, for a given open specification $(P, \Phi)$,
there is a canonical model which (a) does not introduce any extraneous types,
and (b) does not force any constraints not shared by all models. The answer is
“yes,” provided that $\Phi$ does not contain any constraints of the form $\mathrm{Atom}(\tau)$.
Such a canonical model is the initial model, described as follows.

Let $(P, \Phi)$ be an open specification, and let $M_1 = (\mathbf{L}_1, f_1 : \mathrm{Aug}(P) \to L_1)$ and
$M_2 = (\mathbf{L}_2, f_2 : \mathrm{Aug}(P) \to L_2)$ be models of $(P, \Phi)$. A morphism $h : M_1 \to M_2$ is just a
bounded lattice homomorphism $h : \mathbf{L}_1 \to \mathbf{L}_2$ with the property that $f_2 = h \circ f_1$.
A model $N = (\mathbf{K}, \iota)$ for $(P, \Phi)$ is initial if, for every model $M = (\mathbf{L}, f)$ of
$(P, \Phi)$, there is a unique morphism $h : N \to M$. It is a standard result from
category theory that such initial constructions, when they exist, are unique up to
isomorphism [Chap. 4, Sec. 7]. Thus, we may speak of the initial or canonical
model. In this paper, the term canonical model will be taken to be synonymous
with initial model. We prefer the term canonical model because it conveys a
sense of its representational power, while initial model conveys a purely algebraic
property.

3.4 The Bounded Distributive Lattice of Crowns over P 

The definition of canonical model is abstract; however, it is useful to have a
basic understanding of its structural and combinatorial aspects. To this end, we
provide a concrete construction, beginning with the situation $(P, \emptyset)$, in which
there are no constraints. In that case, the canonical model is represented by the
distributive lattice whose elements are $\top$, together with all expressions built up
from elements of $P$ using $\wedge$ and $\vee$, subject to equivalence via the distributive
laws. The idea of how such expressions are represented parallels closely the idea
of disjunctive normal form (DNF) from propositional logic [pp. 48-49]. Just
as any formula in propositional logic may be converted to one in DNF (using the
distributivity of the corresponding operations), so too may any expression in a
distributive lattice be converted to an equivalent one in such a form. Specifically,
we work with expressions of the following form,

$(a_{11} \wedge a_{12} \wedge \cdots \wedge a_{1n_1}) \vee (a_{21} \wedge a_{22} \wedge \cdots \wedge a_{2n_2}) \vee \cdots \vee (a_{m1} \wedge a_{m2} \wedge \cdots \wedge a_{mn_m})$

in which the $a_{ij}$’s are elements of $P$.

Such representations of individual elements in a lattice can easily be confused
with expressions involving the lattice operations $\wedge$ and $\vee$. Therefore, an alternate
notation is adopted. If $C_i = \{a_{ij} \mid 1 \le j \le n_i\}$, then the set representation of
the above expression is $\{C_1, C_2, \ldots, C_m\}$. Thus, in the set representation, each
element other than $\top$ is represented by a set of subsets of elements of $P$.

There is one further problem; namely, an element may have more than one
representation. For example, $(a \wedge b \wedge c) \vee (a \wedge b)$ and $(a \wedge b)$ are equivalent.
To remedy this, we must disallow expressions in which the types in one of the
disjuncts form a subset of those in another. This leads to a representation using
co-chains or crowns.

Let $\Gamma = \{C_1, C_2, \ldots, C_n\}$ be a set of subsets of $P$. Call $\Gamma$ a crown if for no two
distinct indices $i$ and $j$ is it the case that $C_i \subseteq C_j$. Let $\mathrm{Crown}(P)$ denote the set
of all crowns of $P$, and let $\mathrm{Crown}_\top(P)$ denote $\mathrm{Crown}(P) \cup \{\top\}$.

It is easy to see that the elements of $\mathrm{Crown}_\top(P)$ form a bounded distributive
lattice, under natural operations. Specifically, the empty set $\emptyset$ is the bottom
element $\bot$ of the lattice, and further operations are defined as follows.

(cr-$\vee$): $\{C_1, C_2, \ldots, C_n\} \vee \{D_1, D_2, \ldots, D_m\} =
\mathrm{Crownify}(\{C_1, C_2, \ldots, C_n, D_1, D_2, \ldots, D_m\})$.

(cr-$\wedge$): $\{C_1, C_2, \ldots, C_n\} \wedge \{D_1, D_2, \ldots, D_m\} =
\mathrm{Crownify}(\{C_i \cup D_j \mid 1 \le i \le n \text{ and } 1 \le j \le m\})$.

(The operation Crownify converts its argument into a crown by removing all
subsumed subsets.) The lattice so constructed shall be denoted $\mathrm{CrownLat}(P)$.
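
The crown operations lend themselves to a direct set-based sketch. The encoding below is an assumption for illustration: a crown is a frozenset of frozensets of type names, the bottom element is the empty family, and $\top$ is left out. Conjunct sets are combined by union in the meet, since conjoining two conjunctions pools their conjuncts, and Crownify drops every conjunct set that strictly contains another one, mirroring the absorption $(a \wedge b \wedge c) \vee (a \wedge b) = (a \wedge b)$. The function names are hypothetical.

```python
def crownify(family):
    """Drop every conjunct set that strictly contains another one (absorption)."""
    family = set(family)
    return frozenset(c for c in family if not any(d < c for d in family))

def crown_join(g, h):
    """Join: pool the disjuncts of both crowns, then crownify."""
    return crownify(g | h)

def crown_meet(g, h):
    """Meet: distribute, pooling the conjuncts of each pair of disjuncts."""
    return crownify(c | d for c in g for d in h)

def crown(*conjuncts):
    """Build a crown from strings; each string lists one-letter type names."""
    return frozenset(frozenset(c) for c in conjuncts)

a, b, c = crown("a"), crown("b"), crown("c")
bottom = frozenset()  # the empty family of disjuncts

print(crown_meet(a, b) == crown("ab"))                       # True
print(crown_join(crown("abc"), crown("ab")) == crown("ab"))  # True
# One instance of the distributive law:
print(crown_meet(crown_join(a, b), c)
      == crown_join(crown_meet(a, c), crown_meet(b, c)))     # True
```

With this encoding, `crown_join(bottom, g)` returns `g` and `crown_meet(bottom, g)` returns `bottom`, so the empty family behaves as the least element.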

3.5 Theorem 

For any set $P$, $(\mathrm{CrownLat}(P), \iota : P \to \mathrm{Crown}_\top(P))$, with $\iota : \tau \mapsto \{\{\tau\}\}$, is a
canonical model over $(P, \emptyset)$.

Proof. The proof is a standard free algebra construction [Ch. 1, Sec. 5]. □



3.6 Example 

Let $P = \{a, b\}$. Then $\mathrm{Crown}(P) = \{\{\emptyset\}, \{\{a\}\}, \{\{b\}\}, \{\{a, b\}\}, \{\{a\}, \{b\}\}\}$. The
corresponding free lattice is leftmost in Fig. 1. (The other two lattices will be
considered in 3.10 below.) This example also illustrates why there is a distinct
element assigned to be $\top$, rather than just taking $\top$ to be the largest crown
consisting of all maximal subsets of $P$. The condition that $\top$ is the join of all
elements in $P$ is a constraint, and not a condition which must be satisfied in all
models.



3.7 Combinatorics of the Crown Construction 

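The size of the set of crowns grows very rapidly. For $P = \{a, b, c\}$,
$\mathrm{Crown}(P)$ has eighteen elements besides $\{\emptyset\}$. Specifically, $\mathrm{Crown}(\{a, b, c\}) = \{\{\emptyset\}$,
$\{\{a\}\}$, $\{\{b\}\}$, $\{\{c\}\}$, $\{\{a,b\}\}$, $\{\{a,c\}\}$, $\{\{b,c\}\}$, $\{\{a,b,c\}\}$,
$\{\{a\},\{b\}\}$, $\{\{a\},\{c\}\}$, $\{\{b\},\{c\}\}$,
$\{\{a\},\{b,c\}\}$, $\{\{b\},\{a,c\}\}$, $\{\{c\},\{a,b\}\}$,
$\{\{a,b\},\{a,c\}\}$, $\{\{a,b\},\{b,c\}\}$, $\{\{a,c\},\{b,c\}\}$,
$\{\{a\},\{b\},\{c\}\}$, $\{\{a,b\},\{a,c\},\{b,c\}\}\}$.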
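
[Figure 1: Hasse diagrams of three lattices over $P = \{a, b\}$. Left: the free lattice with elements $\top$, $\{\{a\},\{b\}\}$, $\{\{a\}\}$, $\{\{b\}\}$, $\{\{a,b\}\}$, and $\{\emptyset\} = \bot$. Middle: the lattice in which $\{\{a\},\{b\}\} = \top$ and $\{\{a,b\}\} = \{\emptyset\} = \bot$. Right: the lattice in which $\{\{b\}\} = \{\{a\},\{b\}\} = \top$ and $\{\{a\}\} = \{\{a,b\}\} = \{\emptyset\} = \bot$.]

Fig. 1. Three canonical lattices over $P = \{a, b\}$.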



In general, the number of elements in $\mathrm{Crown}(P)$ is much greater than $2^n$,
although less than $2^{2^n}$, where $n = \mathrm{Card}(P)$. Thus, explicit construction of the canonical lattice for
$(P, \Phi)$ is generally highly impractical. This is true even when $\Phi \ne \emptyset$, as shown
below. Generally, although the canonical lattice will be smaller, it will still be
very large.
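
The growth can be observed by brute-force enumeration for very small $P$. The sketch below assumes the encoding in which a crown is a family of nonempty subsets of $P$, none contained in another, with the bottom element represented by the empty family; the function names are illustrative. Under that encoding, three generators give nineteen crowns: the bottom element plus the eighteen listed for $\{a, b, c\}$ in 3.7.

```python
from itertools import combinations

def nonempty_subsets(P):
    """All nonempty subsets of P, as frozensets."""
    P = sorted(P)
    return [frozenset(c) for r in range(1, len(P) + 1)
            for c in combinations(P, r)]

def crowns(P):
    """All families of nonempty subsets of P in which no member contains another."""
    subsets = nonempty_subsets(P)
    result = []
    for r in range(len(subsets) + 1):
        for family in combinations(subsets, r):
            # c < d is proper inclusion, so a member never excludes itself
            if not any(c < d for c in family for d in family):
                result.append(frozenset(family))
    return result

for P in ("ab", "abc", "abcd"):
    n = len(P)
    print(n, 2 ** n, len(crowns(P)), 2 ** (2 ** n))
# 2 4 5 16
# 3 8 19 256
# 4 16 167 65536
```

The enumeration itself inspects $2^{2^n - 1}$ candidate families, so it is only usable for tiny $P$, which is consistent with the remark that explicit construction is impractical.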



3.8 Coalescing Constraints and Quotient Lattices 

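We now turn to the problem of characterizing the canonical lattice over an open
specification $(P, \Phi)$ in which $\Phi$ is not empty. A congruence on $\mathbf{L} = (L, \vee, \wedge, \top, \bot)$
is an equivalence relation $\equiv$ on $L$ with the property that whenever $x_1 \equiv x_2$ and
$y_1 \equiv y_2$ hold, then $x_1 \vee y_1 \equiv x_2 \vee y_2$ and $x_1 \wedge y_1 \equiv x_2 \wedge y_2$. It is well known
that if (and only if) $\equiv$ is a congruence relation, the equivalence classes $L/{\equiv}$ of
$L$ form a lattice $\mathbf{L}/{\equiv}$ under the induced operations [p. 21].

It is straightforward to show that any set of positive constraints on $P$
defines a congruence in a natural way. For $\tau \in P$, define $\mathrm{ECrown}(\tau) = \{\{\tau\}\}$,
with $\mathrm{ECrown}(\bot) = \{\emptyset\}$ and $\mathrm{ECrown}(\top) = \top$. Then, for $\Phi \subseteq \mathrm{Constraints}^+(P)$,
define $\equiv_\Phi$ to be the finest congruence relation on $\mathrm{Crown}_\top(P)$ which includes the
following identifications.

($\equiv_\Phi$-$\le$): If $(\tau_1 \le \tau_2) \in \Phi$ with $\tau_1, \tau_2 \in P$, then $\{\{\tau_1\}, \{\tau_2\}\} \equiv_\Phi \{\{\tau_2\}\}$.

If $(\top \le \tau) \in \Phi$, then $\{\{\tau\}\} \equiv_\Phi \top$.

If $(\tau \le \bot) \in \Phi$, then $\{\{\tau\}\} \equiv_\Phi \{\emptyset\}$.

($\equiv_\Phi$-$\wedge$): If $(\bigwedge S = \tau) \in \Phi$, then $\{S\} \equiv_\Phi \mathrm{ECrown}(\tau)$.

($\equiv_\Phi$-$\vee$): If $(\bigvee S = \tau) \in \Phi$, then $\{\{\sigma\} \mid \sigma \in S\} \equiv_\Phi \mathrm{ECrown}(\tau)$.

Define $f_{\equiv_\Phi} : \mathrm{Aug}(P) \to \mathrm{Crown}_\top(P)/{\equiv_\Phi}$ by $\tau \mapsto [\{\{\tau\}\}]_{\equiv_\Phi}$, $\bot \mapsto [\{\emptyset\}]_{\equiv_\Phi}$, and
$\top \mapsto [\top]_{\equiv_\Phi}$.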

We are now in a position to crystallize when and how an open specification 
has a model. For technical reasons, we first address the case in which there are 
no constraints of the form Atom(r). 

3.9 Theorem — Characterization of Satisfiability

Let $(P, \Phi)$ be any open specification with the property that $\Phi$ contains no
constraints of the form $\mathrm{Atom}(\tau)$. Then the following conditions are equivalent.

(a) $\mathrm{Mod}(P, \Phi)$ is nonempty.

(b) $(P, \Phi)$ has a finite canonical model, which is given by
$(\mathrm{Crown}_\top(P)/{\equiv_{\Phi^+}}, f_{\equiv_{\Phi^+}})$.

(c) $\equiv_{\Phi^+}$ contains more than one equivalence class and, for each $(\tau_1 \ne \tau_2) \in
\Phi^-$, $\{\{\tau_1\}\}$ and $\{\{\tau_2\}\}$ lie in distinct equivalence classes of $\equiv_{\Phi^+}$.

Proof. First of all, assume that $\Phi$ contains only positive constraints. If $\equiv_\Phi$
contains only one equivalence class, then $\bot$ and $\top$ must collapse to the same element,
which violates the definition of a bounded lattice. Thus, no model can exist. If
there is more than one equivalence class, then the free algebra exists, as specified,
using standard techniques for the construction of free lattices [Ch. 1, Sec. 5].

For the general case, note that an inequality constraint of the form $(\tau_1 \ne
\tau_2)$ does not alter the canonical model, but it may prevent its existence. More
specifically, one first computes the canonical model for the positive constraints,
and then tests to see whether the inequality constraints are satisfied in that
canonical model. The condition identified in (c) exactly recaptures this situation.
□

3.10 Examples 

As in 3.6, let $P = \{a, b\}$, but now let $\Phi = \{(\bigvee\{a, b\} = \top), (\bigwedge\{a, b\} = \bot)\}$.
The corresponding canonical lattice is in the middle of Fig. 1. If we add the
constraint $(a \ne b)$ to $\Phi$, this does not change the canonical model at all, since
this constraint is already satisfied. Adding the constraint $(a \le \bot)$ results in the
rightmost lattice.

3.11 Dealing with Atomic Constraints 

The results of 3.9 do not address atomic constraints (i.e., $\mathrm{Atom}(\tau)$), because the
situation surrounding such constraints is much more difficult. For example, let
$P = \{a, b\}$, and let $\Phi = \{\mathrm{Atom}(a), \mathrm{Atom}(b)\}$. At first glance, it might appear that
the leftmost lattice in Fig. 2 is an initial model for $(P, \Phi)$. However, this is not the
case. The rightmost lattice in Fig. 2 also satisfies these constraints, yet it is not a
homomorphic image of the one on the left, since on the left $[\{\{a\}\}] \wedge [\{\{b\}\}] = \bot$,
yet on the right $[\{\{a\}\}]$ and $[\{\{b\}\}]$ are the same element, and so this value is
also the meet. This type of behavior is typical of constraints such as $\mathrm{Atom}(\tau)$;
initial models often do not exist.

From 3.9, we can deduce that in the absence of atomic constraints, when
there is a model, there is a finite model. This is an extremely important result,
since infinite hierarchies pose all sorts of conceptual and computational
difficulties. Fortunately, this finiteness result remains valid even in the presence of atomic
constraints, as shown below. This result must not be viewed as trivial or frivolous.
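
[Figure 2: Hasse diagrams of two lattices. Left: $\top$ above $\{\{a\},\{b\}\}$, with bottom element $\{\{a,b\}\} = \{\emptyset\} = \bot$. Right: a lattice in which $\{\{a\},\{b\}\} = \{\{a\}\} = \{\{b\}\} = \{\{a,b\}\}$ lies between $\top$ and $\bot = \{\emptyset\}$.]

Fig. 2. Two candidates for initial lattices with constraints $\Phi = \{\mathrm{Atom}(a), \mathrm{Atom}(b)\}$.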



There exist many classes of lattices (e.g., modular lattices) for which initial
models may be infinite. See [Ch. 1, Sec. 5, Exer. 12]. From a computational point
of view, it is indeed fortunate that the distributive model of a type hierarchy
does not share this shortcoming, even in the presence of atomic constraints.



3.12 Theorem — Finiteness of Models 

Let $(P, \Phi)$ be a finite open specification.

(a) If $(P, \Phi)$ has a model, then it has a finite model.

(b) If $(P, \Phi)$ has a canonical model, then this initial model is finite.

Proof. In both cases, the proof depends upon the observation that in any (not
necessarily finite) distributive lattice, the sublattice generated by a finite set is
itself finite. This is easily seen from the crown construction (3.4); under
distributivity, there is only a finite number of distinct expressions which may be built
up from a finite number of elements. Thus, if $(P, \Phi)$ has an infinite model, just
extract the sublattice generated by $P \cup \{\top, \bot\}$; it is guaranteed to be finite. This
also shows that any initial model must be finite, since the part not generated by
$P \cup \{\top, \bot\}$ must be extraneous. □



3.13 Complementation 

In many frameworks, including that of CUE P|, complementation is an implicit 
operation. Thus, every type $\tau$ has a complementary type $\bar{\tau}$ which satisfies the
conditions $\tau \vee \bar{\tau} = \top$ and $\tau \wedge \bar{\tau} = \bot$. For reasons of space limitation, we have
not developed explicit results for this extended framework. However, with minor
modifications, all of the theoretical results of this section carry through in the
presence of complementation. In particular, in the crown construction of 3.4, the
sets of conjuncts (i.e., the $C_i$'s) involve symbols of the form $a$ and of the form
$\bar{a}$, subject to the condition that no set may contain both $a$ and $\bar{a}$. Needless
to say, the size of the canonical model is even larger in the presence of implicit 
complements. In a general context, further information on the effect of including 
implicit complementation may be found in m- 

4 Basic Computational Techniques

The results on the size of the canonical model identified in 3.7 suggest that 
explicit construction of this lattice from an open specification is unrealistic, 
except in the most limited of circumstances. Fortunately, it is not generally 
necessary to provide an explicit construction of the whole hierarchy. Often, it is 
sufficient to know whether a given open specification admits a model; that is, 
that it is not inconsistent. However, even for that question, we have the following 
rather negative result, which is proven in d 

4.1 NP-Completeness 

The question of whether a finite open specification $(P, \Phi)$ has a model is
NP-complete, even when attention is restricted to positive sets of constraints. □

Despite this negative result, it is important to identify basic solution techniques.
In practice, NP-complete problems are dealt with effectively all the time using
a variety of strategies. Later, we shall look at some of these approaches, but
first some basic results must be established.

4.2 Two-Element Interpretations 

The two-element lattice, denoted $\mathbf{2}$, contains $\{\bot, \top\}$ as its only elements. It is
trivially distributive.

Let $(P, \Phi)$ be an open specification. A two-element interpretation for $(P, \Phi)$
is an interpretation of the form $(\mathbf{2}, f)$; thus $f : P \to \{\bot, \top\}$. The set of all
two-element interpretations for $(P, \Phi)$ is denoted $\mathrm{Interp}_2(P, \Phi)$. The set of
two-element models is denoted $\mathrm{Mod}_2(P, \Phi)$.

4.3 Adequacy of Two-Element Models 

A positive open specification $(P, \Phi)$ has a model iff it has a two-element model.

Proof. Let $(L, g : P \to L)$ be a model of $(P, \Phi)$. In view of the Birkhoff-Stone
representation theorem (2.7), we may take $L$ to be a ring of sets. Let $s$ be any
element from this ring which appears in some members of $L$, but not in all.
Then, it is easily verified that $(\mathbf{2}, g_s : P \to \{\bot, \top\})$ with $g_s : \tau \mapsto \top$ if $s \in g(\tau)$
and $g_s : \tau \mapsto \bot$ otherwise, is a two-element model. The converse is trivial. □

4.4 Limitations of Two-Element Models 

Two-element models are sufficient if one is interested only in positive constraints.
However, they are inadequate for negative constraints. For example, let
$P = \{\tau_1, \tau_2, \tau_3\}$, and let $\Phi = \{(\tau_1 \neq \tau_2), (\tau_1 \neq \tau_3), (\tau_2 \neq \tau_3)\}$. Then $(P, \Phi)$ cannot
have a two-element model, because $\tau_1$, $\tau_2$, and $\tau_3$ must be distinct elements in
the underlying lattice. Fortunately, it is possible to combine two-element models
to obtain larger models which satisfy both positive and negative constraints, as
we now show.
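
The pigeonhole argument in the example above can be checked mechanically. The following is only a minimal sketch: the encoding of types as strings and the helper name `two_element_models` are illustrative assumptions, not part of the formal development, and only inequality constraints are handled.

```python
from itertools import product

TOP, BOT = True, False  # the two elements of the lattice 2 (assumed encoding)


def two_element_models(types, distinct_pairs):
    """Enumerate every f : types -> {BOT, TOP} satisfying all (a != b) constraints."""
    models = []
    for values in product([BOT, TOP], repeat=len(types)):
        f = dict(zip(types, values))
        if all(f[a] != f[b] for a, b in distinct_pairs):
            models.append(f)
    return models


types = ["t1", "t2", "t3"]
pairwise_distinct = [("t1", "t2"), ("t1", "t3"), ("t2", "t3")]

# Three pairwise-distinct types cannot fit into two lattice elements.
print(two_element_models(types, pairwise_distinct))
# A single inequality, by contrast, is easily satisfied.
print(len(two_element_models(types, [("t1", "t2")])))
```

With three pairwise inequalities the enumeration comes back empty, whereas a single inequality leaves four admissible assignments.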
4.5 Product Interpretations and Representations

A key notion in the construction of models for a general set of constraints is that
of the product model. Let $(P, \Phi)$ be an open specification, and let
$\mathcal{I} = \{(L_j, g_j : \mathrm{Aug}(P) \to L_j) \mid j \in J\}$ be a finite set of interpretations of $(P, \Phi)$. Define the
product interpretation to be $\prod \mathcal{I} = (\prod_{j \in J} L_j,\; g : \mathrm{Aug}(P) \to \prod_{j \in J} L_j)$, where $g$ is given
componentwise by the $g_j$ and $\prod_{j \in J} L_j$ is the product lattice, which is easily verified
to be distributive.

Conversely, given an interpretation $I = (L, g : \mathrm{Aug}(P) \to L)$, it is possible to
construct a product interpretation which satisfies the same constraints. In view
of 2.3.3, $L$ may be taken to be a nonredundant ring of sets. Let $S$ be the basis
for such a ring. For each $s \in S$, define a two-element interpretation
$I_s = (\mathbf{2}, g_s : \mathrm{Aug}(P) \to \{\bot, \top\})$ with $g_s : \tau \mapsto \top$ if $s \in g(\tau)$ and $\tau \mapsto \bot$ otherwise, for $\tau \in P$.
Then $\prod_{s \in S} I_s$ satisfies the same constraints as $I$.

We are now able to provide the main characterization theorem for existence 
and structure of models of open specifications. 

4.6 Characterization of Models of Arbitrary Open Specifications 

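An open specification $(P, \Phi)$ is satisfiable iff there is a finite nonempty family
$\mathcal{I} \subseteq \mathrm{Interp}_2(P, \Phi)$ satisfying the following conditions.

(a) Each positive $\varphi \in \Phi$ is satisfied by every $I \in \mathcal{I}$.

(b) Each $\varphi \in \Phi$ of the form $(\tau_1 \neq \tau_2)$ is satisfied by at least one $I \in \mathcal{I}$.

(c) Each $\varphi \in \Phi$ of the form $\mathrm{Atom}(\tau)$ is satisfied by exactly one $I \in \mathcal{I}$.

□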
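
Conditions (a) and (b) can be illustrated on the specification of 4.4. The sketch below is an illustration under stated assumptions: constraints are encoded as tuples, `is_witness_family` and `product_value` are hypothetical helper names, atomic constraints are left out, and the bounds ⊤ and ⊥ are not modelled.

```python
def satisfies(f, constraint):
    """Does the two-element interpretation f (a dict type -> bool) satisfy the constraint?"""
    kind = constraint[0]
    if kind == "le":                      # (a <= b)
        _, a, b = constraint
        return (not f[a]) or f[b]
    if kind == "join":                    # (join of parts = whole)
        _, parts, whole = constraint
        return any(f[t] for t in parts) == f[whole]
    if kind == "meet":                    # (meet of parts = whole)
        _, parts, whole = constraint
        return all(f[t] for t in parts) == f[whole]
    if kind == "ne":                      # (a != b)
        _, a, b = constraint
        return f[a] != f[b]
    raise ValueError(f"unsupported constraint kind: {kind}")


def is_witness_family(family, constraints):
    """Check (a) and (b): positive constraints hold everywhere, inequalities somewhere."""
    if not family:
        return False
    for c in constraints:
        hits = [satisfies(f, c) for f in family]
        if c[0] == "ne":
            if not any(hits):
                return False
        elif not all(hits):
            return False
    return True


def product_value(family, t):
    """Value of type t in the product interpretation built from the family."""
    return tuple(f[t] for f in family)


constraints = [("ne", "t1", "t2"), ("ne", "t1", "t3"), ("ne", "t2", "t3")]
f1 = {"t1": True, "t2": False, "t3": False}
f2 = {"t1": False, "t2": True, "t3": False}

print(is_witness_family([f1, f2], constraints))
print(is_witness_family([f1], constraints))
print([product_value([f1, f2], t) for t in ("t1", "t2", "t3")])
```

Neither interpretation alone separates all three types, but in the product of the two they receive three distinct values, so the product is a model of the specification of 4.4.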

4.7 Model Characterization via Satisfiability in Propositional Logic 

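We now turn to the question of computing two-element models for $(P, \Phi)$. The
simplest and most direct approach is to reduce the problem to one of
satisfiability in a propositional logic. Associate with $P$ a propositional logic whose
propositional letters are $\{t_\tau \mid \tau \in \mathrm{Aug}(P)\}$. To each $\varphi \in \mathrm{Constraints}(P)$ is
associated a propositional formula $S(\varphi)$ according to the table below. For a set $\Phi$ of
constraints, define $S(\Phi) = \{S(\varphi) \mid \varphi \in \Phi\}$.

Constraint $\varphi$                 Associated Logical Formula $S(\varphi)$

$\tau_1 \leq \tau_2$                   $t_{\tau_1} \Rightarrow t_{\tau_2}$
$\bigvee_{i=1}^{n} \tau_i = \tau$            $t_{\tau_1} \vee t_{\tau_2} \vee \ldots \vee t_{\tau_n} \Leftrightarrow t_\tau$
$\bigwedge_{i=1}^{n} \tau_i = \tau$          $t_{\tau_1} \wedge t_{\tau_2} \wedge \ldots \wedge t_{\tau_n} \Leftrightarrow t_\tau$
$\tau_1 \neq \tau_2$                   $\neg(t_{\tau_1} \Leftrightarrow t_{\tau_2})$
$\mathrm{Atom}(\tau)$                  $t_\tau$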

Each stable interpretation $I$ of the propositional logic (i.e., a truth assignment
to each proposition, with $t_\top$ true and $t_\bot$ false) naturally defines a two-element
interpretation $\bar{I} = (\mathbf{2}, g_I : \mathrm{Aug}(P) \to \{\bot, \top\})$ of $P$ via $g_I : \tau \mapsto \top$ if $t_\tau$ is
true in $I$, and $\tau \mapsto \bot$ otherwise. Thus, the two-element models of $(P, \Phi)$ are in
natural bijective correspondence with the stable models of $S(\Phi)$.
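
The reduction can be made concrete by brute force on a small instance. The following sketch makes several assumptions for illustration only: constraints are encoded as tuples, the names `formula` and `stable_models` are hypothetical, the reserved names `"TOP"` and `"BOT"` stand for the bounds, atomic constraints are omitted, and the exhaustive enumeration is exponential (it is not a SAT procedure).

```python
from itertools import product


def formula(constraint):
    """Translate a constraint into a predicate over truth assignments v (dict type -> bool)."""
    kind = constraint[0]
    if kind == "le":
        _, a, b = constraint
        return lambda v: (not v[a]) or v[b]
    if kind == "join":
        _, parts, whole = constraint
        return lambda v: any(v[t] for t in parts) == v[whole]
    if kind == "meet":
        _, parts, whole = constraint
        return lambda v: all(v[t] for t in parts) == v[whole]
    if kind == "ne":
        _, a, b = constraint
        return lambda v: v[a] != v[b]
    raise ValueError(f"unsupported constraint kind: {kind}")


def stable_models(P, constraints):
    """All stable assignments (TOP true, BOT false) satisfying every translated formula."""
    formulas = [formula(c) for c in constraints]
    found = []
    for bits in product([False, True], repeat=len(P)):
        v = dict(zip(P, bits))
        v["TOP"], v["BOT"] = True, False      # stability condition
        if all(phi(v) for phi in formulas):
            found.append({t: v[t] for t in P})
    return found


P = ["a", "b", "c"]
Phi = [("join", ("a", "b"), "c"), ("meet", ("a", "b"), "BOT")]
for model in stable_models(P, Phi):
    print(model)
```

For this hypothetical specification (c is the join of a and b, which are disjoint) exactly three stable assignments survive, each corresponding to one two-element model.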
4.8 Queries

Given an open specification $(P, \Phi)$, we may wish to know whether certain other
constraints are implied by this specification. For example, if $\Phi$ is the partial
specification of a hierarchy, we may wish to know whether $\Phi$ forces $\tau_1 = \tau_2$ to hold.
A query is a question of the form "Does $\Phi \models \varphi$ hold?" with $\Phi \subseteq \mathrm{Constraints}(P)$
and $\varphi \in \mathrm{Constraints}(P)$. The question of whether $\Phi$ forces $\tau_1 = \tau_2$ can be posed
formally as the two queries $\Phi \models (\tau_1 \leq \tau_2)$ and $\Phi \models (\tau_2 \leq \tau_1)$. Unfortunately,
answering queries is co-NP-complete HU Sec. 7.1].

4.9 Theorem Complexity of Query Processing 

Let $P$ be any finite set. The question of whether $\Phi \models \varphi$ for $\Phi \subseteq \mathrm{Constraints}(P)$
and $\varphi \in \mathrm{Constraints}(P)$ is co-NP-complete, in the size of $P$.

Proof. First of all, note that $\Phi \models \varphi$ holds iff $\Phi \cup \{\neg\varphi\}$ is unsatisfiable, where
$\neg\varphi$ is the negation of the constraint $\varphi$, with the obvious semantics. The set
$\mathrm{Constraints}(P)$, together with logical negations of its elements, will be called the
set of extended constraints over $P$.

Now deciding whether a set of extended constraints is satisfiable is at least
as difficult as deciding whether or not a set of ordinary constraints (i.e., a subset
of $\mathrm{Constraints}(P)$) is satisfiable. Thus, in view of 4.1, it is NP-hard. On the other
hand, it is also in NP, since we may guess at a solution and then test it in
linear time. Thus, the problem of deciding the satisfiability of $\Phi \cup \{\neg\varphi\}$ is
NP-complete. Consequently, the problem of deciding the unsatisfiability of such a
set is co-NP-complete. □
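
The refutation view of queries can be sketched for a restricted case. The code below assumes that the specification is positive and that the query is an order constraint; under these assumptions the separation argument used in the proof of 4.3 suggests that a counter-model, if one exists, may be taken to be two-element, so the query holds iff no two-element model refutes it. The tuple encoding and the names `entails_le` and `entails_eq` are illustrative, and the enumeration is exponential, in line with the complexity result.

```python
from itertools import product


def holds(f, constraint):
    """Satisfaction of a positive constraint by a two-element interpretation f."""
    kind = constraint[0]
    if kind == "le":
        _, a, b = constraint
        return (not f[a]) or f[b]
    if kind == "join":
        _, parts, whole = constraint
        return any(f[t] for t in parts) == f[whole]
    if kind == "meet":
        _, parts, whole = constraint
        return all(f[t] for t in parts) == f[whole]
    raise ValueError(f"unsupported constraint kind: {kind}")


def two_element_models(P, positive):
    for bits in product([False, True], repeat=len(P)):
        f = dict(zip(P, bits))
        if all(holds(f, c) for c in positive):
            yield f


def entails_le(P, positive, a, b):
    """Phi |= (a <= b): no two-element model of Phi has f(a) = TOP and f(b) = BOT."""
    return all((not f[a]) or f[b] for f in two_element_models(P, positive))


def entails_eq(P, positive, a, b):
    """Phi |= (a = b), posed as the two order queries of 4.8."""
    return entails_le(P, positive, a, b) and entails_le(P, positive, b, a)


P = ["a", "b", "c"]
chain = [("le", "a", "b"), ("le", "b", "c")]
print(entails_le(P, chain, "a", "c"), entails_le(P, chain, "c", "a"))

absorbed = [("join", ("a", "b"), "c"), ("le", "a", "b")]
print(entails_eq(P, absorbed, "b", "c"), entails_eq(P, absorbed, "a", "b"))
```

In the second specification the join of a and b collapses onto b because a lies below b, so the equality of b and c is forced while that of a and b is not.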

5 Efficient Consistency Checking Using Horn Clauses 

The results presented in the previous section paint a fairly negative picture of the 
tractability issues surrounding maintenance of type hierarchies which are openly 
specified. In particular, it may not be feasible to conduct a complete check of the 
admissibility of a specification. Under such circumstances, there are two tacks
which may be taken. First, one may look for an efficient strategy which solves 
a limited number of problem instances completely. Second, one may look for an 
efficient strategy which works on any problem instance, but yields only partial 
information. In this section, we take the latter approach. 

Horn clauses form an important class of sentences for a variety of reasons 
pf|. In particular, in propositional logic, they admit very efficient inference
mechanisms. While the best known inference algorithms for general propositional
logic run in exponential time in the worst case, Horn-clause inference may be
performed in linear time p, HHj. In what follows, we show how to use
a Horn-clause formulation to detect forced equivalences in open specifications
(that is, constraint sets for which $\Phi \models (\tau_1 = \tau_2)$). The result is not a mere
query mechanism, but a procedure which generates a list of such equivalences. 
This technique improves substantially upon an earlier one presented in pi 41 Sec. 
2.1], in simplicity, in improved computational complexity, and in extensibility. 
5.1 Horn Clauses

In a propositional logic, a Horn clause is one in which at most one of the literals is
positive. Usually, we will write the Horn clause $\neg p_1 \vee \neg p_2 \vee \ldots \vee \neg p_n \vee q$ in rule form,
as $p_1 \wedge p_2 \wedge \ldots \wedge p_n \Rightarrow q$. If there is no positive literal, we will write
$p_1 \wedge p_2 \wedge \ldots \wedge p_n \Rightarrow F$, with F representing the proposition which is always false. Simple clauses
consisting of one positive literal (e.g., $p$) are referred to as facts; the atom name
itself will represent such clauses. The empty clause will be represented by F.

For any set $\Psi$ of Horn clauses, define $\mathrm{Facts}(\Psi)$ to be the set of all facts which
are semantic consequences of $\Psi$. $\mathrm{Facts}(\Psi)$ may be computed in time which is
linear in the size of the clause set P, Plj.

We will also work with conjunctions of Horn clauses, as though they were
clauses themselves. Thus, a formula such as $(p \wedge q \Rightarrow r \wedge s)$ is to be regarded as an
abbreviation for the conjunction $(p \wedge q \Rightarrow r) \wedge (p \wedge q \Rightarrow s)$.
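
One standard way to obtain the linear bound is counter-based forward chaining, in which every clause keeps a count of its not-yet-derived antecedents. The sketch below illustrates that general technique and is not necessarily the procedure of the cited references; the representation of a clause as a `(body, head)` pair and the name `horn_facts` are assumptions made for the example.

```python
from collections import deque

FALSE = "F"  # the always-false proposition used as the head of negative clauses


def horn_facts(clauses):
    """Forward chaining with per-clause counters.

    clauses: list of (body, head) pairs; body is a tuple of atoms (empty for a
    fact) and head is an atom or FALSE. Returns (facts, consistent). Each
    clause is touched once per antecedent, so the work is linear in the total
    length of the clauses.
    """
    remaining = [len(set(body)) for body, _ in clauses]
    watchers = {}
    for i, (body, _) in enumerate(clauses):
        for atom in set(body):
            watchers.setdefault(atom, []).append(i)

    queue = deque(head for (body, head), n in zip(clauses, remaining) if n == 0)
    derived = set()
    while queue:
        atom = queue.popleft()
        if atom in derived:
            continue
        derived.add(atom)
        for i in watchers.get(atom, []):
            remaining[i] -= 1
            if remaining[i] == 0:          # all antecedents derived: fire the rule
                queue.append(clauses[i][1])

    consistent = FALSE not in derived
    derived.discard(FALSE)
    return derived, consistent


rules = [((), "p"), (("p",), "q"), (("p", "q"), "r"), (("s",), "t")]
print(horn_facts(rules))
print(horn_facts(rules + [(("r",), FALSE)]))
```

When F is derived the clause set is unsatisfiable, which is why the function reports consistency separately from the derived atoms.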



5.2 Exclusive-or Constraints 

As noted above, inference on sets of Horn clauses is very fast; the best algo- 
rithms are linear in the size of the input set. Since the problem of determining 
satisfiability of an open specification is NP-complete (see 4.1 above), we cer- 
tainly cannot expect Horn clauses to be a vehicle for the complete description 
of open specifications. Rather, some additional forms of representation must be 
employed to recapture completely such specifications. 

A propositional formula of the form $p \oplus q$ is called an exclusive-or constraint,
or xor-constraint, for short. The formula $p \oplus q$ is equivalent to $(p \vee q) \wedge (\neg p \vee \neg q)$.
Such a constraint expresses the restriction that exactly one of two alternatives 
must be true. In the work reported in this section, the constraints identified in 
the table of 4.7 will be re-expressed using a combination of Horn clauses and 
xor-constraints. Such a representation carries the advantage that the “tractable” 
part of the representation (the Horn clauses) is completely separated from the 
“intractable” part (the xor-constraints). Effective computational techniques may 
then focus on the Horn part, looking for inconsistencies in the specification. 
While such a technique will not find all inconsistencies, it can find many, as we 
shall see. 



5.3 The Horn Clauses Associated with an Open Specification 

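Let $P$ be any finite set of clean types. Define two $\mathrm{Aug}(P)$-indexed sets of
propositions, as follows: $\mathrm{Prop}_{\Uparrow}(P) = \{p_\tau \mid \tau \in \mathrm{Aug}(P)\}$; $\mathrm{Prop}_{\Downarrow}(P) = \{q_\tau \mid \tau \in \mathrm{Aug}(P)\}$.
$\mathrm{Prop}_{\Updownarrow}(P)$ denotes $\mathrm{Prop}_{\Uparrow}(P) \cup \mathrm{Prop}_{\Downarrow}(P)$. Relative to a two-element
interpretation $I = (\mathbf{2}, f)$, think of $p_\tau$ as representing the statement $f(\tau) = \top$, and $q_\tau$ as
representing $f(\tau) = \bot$.

Now let $\Phi$ be a set of constraints over $P$. For each $\varphi \in \Phi$, associate two sets
of Horn clauses, over $\mathrm{Prop}_{\Uparrow}(P)$ and $\mathrm{Prop}_{\Downarrow}(P)$, according to the table below.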
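
$\varphi$                        $\mathrm{Rules}_{\Uparrow}(\{\varphi\})$                                   $\mathrm{Rules}_{\Downarrow}(\{\varphi\})$

$(\tau_1 \leq \tau_2)$             $(p_{\tau_1} \Rightarrow p_{\tau_2})$                                   $(q_{\tau_2} \Rightarrow q_{\tau_1})$

$(\bigvee_{i=1}^{n} \tau_i = \tau)$      $(p_{\tau_1} \Rightarrow p_\tau), .., (p_{\tau_n} \Rightarrow p_\tau)$                  $(q_\tau \Rightarrow q_{\tau_1}), .., (q_\tau \Rightarrow q_{\tau_n})$,
                                                                          $(q_{\tau_1} \wedge q_{\tau_2} \wedge .. \wedge q_{\tau_n} \Rightarrow q_\tau)$

$(\bigwedge_{i=1}^{n} \tau_i = \tau)$    $(p_\tau \Rightarrow p_{\tau_1}), .., (p_\tau \Rightarrow p_{\tau_n})$,                 $(q_{\tau_1} \Rightarrow q_\tau), .., (q_{\tau_n} \Rightarrow q_\tau)$
                           $(p_{\tau_1} \wedge p_{\tau_2} \wedge .. \wedge p_{\tau_n} \Rightarrow p_\tau)$

$(\tau_1 \neq \tau_2)$             $(p_{\tau_1} \wedge p_{\tau_2} \Rightarrow F)$                              $(q_{\tau_1} \wedge q_{\tau_2} \Rightarrow F)$

$\mathrm{Atom}(\tau)$              $(p_\tau)$                                              $(q_\tau \Rightarrow F)$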



For a set $\Phi \subseteq \mathrm{Constraints}(P)$, define

$\mathrm{Rules}_{\Uparrow}(\Phi) = \{(p_\top), (p_\bot \Rightarrow F)\} \cup (\bigcup_{\varphi \in \Phi} \mathrm{Rules}_{\Uparrow}(\{\varphi\}))$

$\mathrm{Rules}_{\Downarrow}(\Phi) = \{(q_\bot), (q_\top \Rightarrow F)\} \cup (\bigcup_{\varphi \in \Phi} \mathrm{Rules}_{\Downarrow}(\{\varphi\}))$.

These sets of clauses are called the $\Uparrow$-rules (resp. $\Downarrow$-rules) for $\Phi$. The notation is
suggestive of the semantics of these rules. The $\Uparrow$-rules express closure conditions
on the elements of $P$ which must be true in a two-element model $I = (\mathbf{2}, f)$, while
the $\Downarrow$-rules express similar conditions on elements which must be false. Consider
the generic constraint $(\bigvee_{i=1}^{n} \tau_i = \tau)$. If it holds, several things are implied. First
of all, for each $i$, $\tau_i \leq \tau$. Thus, for any $i$, if $f(\tau_i) = \top$, then it must be that
$f(\tau) = \top$ also. This condition is recaptured by the rule $(p_{\tau_i} \Rightarrow p_\tau)$. Similarly,
if $f(\tau) = \bot$, then $f(\tau_i) = \bot$ must hold as well, and this is recaptured by the
rule $(q_\tau \Rightarrow q_{\tau_i})$. Furthermore, if $f(\tau_i) = \bot$ for each $i$, then the join condition
mandates that $f(\tau) = \bot$ also; this is recaptured by the rule
$(q_{\tau_1} \wedge q_{\tau_2} \wedge .. \wedge q_{\tau_n} \Rightarrow q_\tau)$. Note that there is no corresponding $\Uparrow$-rule in this case. The situation for a
constraint of the form $(\bigwedge_{i=1}^{n} \tau_i = \tau)$ is completely analogous.

The rules in $\{(p_\top), (p_\bot \Rightarrow F)\}$ and in $\{(q_\bot), (q_\top \Rightarrow F)\}$ are called bound
rules, because they assert that $f(\top) = \top$ and $f(\bot) = \bot$, respectively, for any
two-element model $(\mathbf{2}, f)$ of $(P, \Phi)$.

In addition to the $\Uparrow$-rules and $\Downarrow$-rules, there are xor-constraints which assert
that for any $\tau \in P$, exactly one of $f(\tau) = \top$ and $f(\tau) = \bot$ must hold for
any given two-element model $(\mathbf{2}, f)$. The XOR constraints, rule set, and total
representation are defined as follows.

$\mathrm{XOR}(P) = \{p_\tau \oplus q_\tau \mid \tau \in P\}$

$\mathrm{Rules}(\Phi) = \mathrm{Rules}_{\Uparrow}(\Phi) \cup \mathrm{Rules}_{\Downarrow}(\Phi)$

$\mathrm{TotalRep}(P, \Phi) = \mathrm{Rules}(\Phi) \cup \mathrm{XOR}(P)$

Finally, given a two-element interpretation $I = (\mathbf{2}, f)$ of $(P, \Phi)$, define the fact
set of $I$ as $\mathrm{FactSet}(I) = \{p_x \mid x \in f^{-1}(\top)\} \cup \{q_x \mid x \in f^{-1}(\bot)\}$. Then, using the
semantics which have been outlined above, it is easy to establish the following
alternative to 4.7.
5.4 Characterization of Two-Element Models

Let $(P, \Phi)$ be an open specification, and let $I = (\mathbf{2}, f) \in \mathrm{Interp}_2(P, \Phi)$. Then
$I \in \mathrm{Mod}_2(P, \Phi)$ iff the set $\mathrm{FactSet}(I) \cup \mathrm{TotalRep}(P, \Phi)$ of propositional formulas
is satisfiable. □
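
The characterization can be exercised on a small instance. Once the fact set of an interpretation is fixed, the xor-constraints and bound rules determine the truth value of every proposition, so satisfiability reduces to checking each rule under that single assignment. The sketch below relies on this observation; the tuple encoding of constraints, the reserved names `"TOP"` and `"BOT"`, and the function names are assumptions made for illustration, and atomic constraints are omitted.

```python
from itertools import product


def rules(constraints):
    """Horn rules (body, head) over atoms ('p', t) / ('q', t); the head 'F' is falsity."""
    rs = [((), ("p", "TOP")), ((("p", "BOT"),), "F"),     # bound rules
          ((), ("q", "BOT")), ((("q", "TOP"),), "F")]
    for c in constraints:
        kind = c[0]
        if kind == "le":
            _, a, b = c
            rs += [((("p", a),), ("p", b)), ((("q", b),), ("q", a))]
        elif kind == "join":
            _, parts, t = c
            for s in parts:
                rs += [((("p", s),), ("p", t)), ((("q", t),), ("q", s))]
            rs.append((tuple(("q", s) for s in parts), ("q", t)))
        elif kind == "meet":
            _, parts, t = c
            for s in parts:
                rs += [((("p", t),), ("p", s)), ((("q", s),), ("q", t))]
            rs.append((tuple(("p", s) for s in parts), ("p", t)))
        elif kind == "ne":
            _, a, b = c
            rs += [((("p", a), ("p", b)), "F"), ((("q", a), ("q", b)), "F")]
        else:
            raise ValueError(f"unsupported constraint kind: {kind}")
    return rs


def is_model_via_rules(f, constraints):
    g = dict(f, TOP=True, BOT=False)

    def true_atom(atom):
        if atom == "F":
            return False
        kind, t = atom
        return g[t] if kind == "p" else not g[t]          # p_t: f(t)=TOP, q_t: f(t)=BOT

    return all(true_atom(head) or not all(true_atom(b) for b in body)
               for body, head in rules(constraints))


def is_model_directly(f, constraints):
    g = dict(f, TOP=True, BOT=False)
    for c in constraints:
        if c[0] == "le" and g[c[1]] and not g[c[2]]:
            return False
        if c[0] == "join" and any(g[t] for t in c[1]) != g[c[2]]:
            return False
        if c[0] == "meet" and all(g[t] for t in c[1]) != g[c[2]]:
            return False
        if c[0] == "ne" and g[c[1]] == g[c[2]]:
            return False
    return True


P = ["a", "b", "c"]
Phi = [("join", ("a", "b"), "c"), ("meet", ("a", "b"), "BOT"), ("ne", "a", "c")]
candidates = [dict(zip(P, bits)) for bits in product([False, True], repeat=3)]
print([f for f in candidates if is_model_via_rules(f, Phi)])
```

On this hypothetical specification the rule-based test and the direct lattice-level test agree on all eight interpretations, and a single model remains.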

5.5 Example 

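Let $P = \{\kappa_i \mid 1 \leq i \leq 6\}$, and let $\Phi = \{(\bigvee\{\kappa_1, \kappa_2\} = \kappa_5), (\bigvee\{\kappa_2, \kappa_3\} = \kappa_5),
(\bigvee\{\kappa_1, \kappa_3\} = \kappa_4), (\bigwedge\{\kappa_2, \kappa_3\} = \kappa_6), (\bigwedge\{\kappa_1, \kappa_6\} = \bot), (\kappa_5 \leq \kappa_4)\}$. The
table below shows the associated rules.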



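$\mathrm{Rules}_{\Uparrow}(\Phi)$                                  $\mathrm{Rules}_{\Downarrow}(\Phi)$

$p_{\kappa_1} \Rightarrow p_{\kappa_4} \wedge p_{\kappa_5}$                   $q_{\kappa_2} \Rightarrow q_{\kappa_6}$
$p_{\kappa_2} \Rightarrow p_{\kappa_5}$                              $q_{\kappa_3} \Rightarrow q_{\kappa_6}$
$p_{\kappa_3} \Rightarrow p_{\kappa_4} \wedge p_{\kappa_5}$                   $q_{\kappa_4} \Rightarrow q_{\kappa_1} \wedge q_{\kappa_3} \wedge q_{\kappa_5}$
$p_{\kappa_5} \Rightarrow p_{\kappa_4}$                              $q_{\kappa_5} \Rightarrow q_{\kappa_1} \wedge q_{\kappa_2} \wedge q_{\kappa_3}$
$p_{\kappa_6} \Rightarrow p_{\kappa_2} \wedge p_{\kappa_3}$                   $q_{\kappa_1} \wedge q_{\kappa_2} \Rightarrow q_{\kappa_5}$
$p_{\kappa_2} \wedge p_{\kappa_3} \Rightarrow p_{\kappa_6}$                   $q_{\kappa_2} \wedge q_{\kappa_3} \Rightarrow q_{\kappa_5}$
$p_{\kappa_1} \wedge p_{\kappa_6} \Rightarrow p_\bot$                     $q_{\kappa_1} \wedge q_{\kappa_3} \Rightarrow q_{\kappa_4}$
$p_\top$                                         $q_\top \Rightarrow F$
$p_\bot \Rightarrow F$                                  $q_\bot$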



Note that the rules $(p_\bot \Rightarrow p_{\kappa_1})$, $(p_\bot \Rightarrow p_{\kappa_6})$, $(q_{\kappa_1} \Rightarrow q_\bot)$, and $(q_{\kappa_6} \Rightarrow q_\bot)$
are not included in the table, even though they are formally members of the
appropriate rule set. Since $p_\bot$ is always false, and $q_\bot$ is always true, they are
trivial tautologies, and so may be omitted.

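Next, define the two-element interpretations $I_j = (\mathbf{2}, f_j)$ for $1 \leq j \leq 7$
according to the following table.

$j$     $f_j^{-1}(\top)$                                    $f_j^{-1}(\bot)$

1     $\{\kappa_3, \kappa_4, \kappa_5\}$                          $\{\kappa_1, \kappa_2, \kappa_6\}$
2     $\{\kappa_2, \kappa_4, \kappa_5\}$                          $\{\kappa_1, \kappa_3, \kappa_6\}$
3     $\emptyset$                                        $\{\kappa_1, \kappa_2, \kappa_3, \kappa_4, \kappa_5, \kappa_6\}$
4     $\{\kappa_1, \kappa_2, \kappa_3, \kappa_4, \kappa_5, \kappa_6\}$           $\emptyset$
5     $\{\kappa_1, \kappa_4\}$                               $\{\kappa_2, \kappa_3, \kappa_5, \kappa_6\}$
6     $\{\kappa_1, \kappa_2, \kappa_3, \kappa_4, \kappa_5\}$                $\{\kappa_6\}$
7     $\{\kappa_1, \kappa_2, \kappa_3, \kappa_4, \kappa_5, \kappa_6\}$           $\emptyset$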



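$I_1$, $I_2$, and $I_3$ are easily verified to be models of $(P, \Phi)$. Indeed, it is not difficult
to see that they are the only two-element models. On the other hand, $I_4$ fails
to be a model, since $(p_{\kappa_1} \wedge p_{\kappa_6} \Rightarrow p_\bot)$ would then mandate that $\bot \in f_4^{-1}(\top)$,
an impossibility. $I_5$ fails to be a model of $(P, \Phi)$, since the rule $(p_{\kappa_1} \Rightarrow p_{\kappa_4} \wedge p_{\kappa_5})$
mandates that $\kappa_5 \in f_5^{-1}(\top)$, contradicting $\kappa_5 \in f_5^{-1}(\bot)$. Similarly, $I_6$ is not a
model of $(P, \Phi)$, since the rule $(p_{\kappa_2} \wedge p_{\kappa_3} \Rightarrow p_{\kappa_6})$ mandates that $\kappa_6 \in f_6^{-1}(\top)$,
contradicting $\kappa_6 \in f_6^{-1}(\bot)$. Finally, $I_7$ is not a model, since the rule
$(p_{\kappa_1} \wedge p_{\kappa_6} \Rightarrow p_\bot)$ requires that at least one of $\{\kappa_1, \kappa_6\}$ lie in $f_7^{-1}(\bot)$.

Now let $\Phi' = \Phi \cup \{(\kappa_4 \neq \kappa_5)\}$. Then $\mathrm{TotalRep}(P, \Phi') = \mathrm{TotalRep}(P, \Phi) \cup
\{(p_{\kappa_4} \wedge p_{\kappa_5} \Rightarrow F), (q_{\kappa_4} \wedge q_{\kappa_5} \Rightarrow F)\}$. It is easy to see that $(P, \Phi')$ has no model.
Indeed, $\{(q_{\kappa_5} \Rightarrow q_{\kappa_1} \wedge q_{\kappa_2} \wedge q_{\kappa_3}), (q_{\kappa_1} \wedge q_{\kappa_3} \Rightarrow q_{\kappa_4})\} \models (q_{\kappa_5} \Rightarrow q_{\kappa_4})$, and since
$(q_{\kappa_4} \Rightarrow q_{\kappa_5})$ also holds, this means that for any two-element model $(\mathbf{2}, f)$,
$\kappa_4 \in f^{-1}(\bot)$ iff $\kappa_5 \in f^{-1}(\bot)$. Thus, $f(\kappa_4) = f(\kappa_5)$. This is impossible, hence there
can be no model of $(P, \Phi')$.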

5.6 Static Consistency Checking 

Let $(P, \Phi)$ be an open specification. For any set $X \subseteq \mathrm{Prop}_{\Updownarrow}(P)$,
$\mathrm{Facts}(X \cup \mathrm{Rules}(\Phi))$ may be computed very efficiently, in time proportional to the size of
the clause set. The idea behind static consistency checking is to compute $\mathrm{Facts}(X \cup \mathrm{Rules}(\Phi))$
for each member $X$ of a judiciously chosen set of subsets of $\mathrm{Prop}_{\Updownarrow}(P)$. From this
computation, many properties of solutions may be detected. One such example
is provided by $(P, \Phi)$ of 5.5. The condition $(\kappa_4 = \kappa_5)$ holds in every model, as
the failure of $(P, \Phi')$ to have a model confirms. In 5.5, this failure was shown
via a direct proof, which essentially posed $(\kappa_4 = \kappa_5)$ as a query. To check each
such condition separately would require a great deal of computational resources.
With a static consistency check, on the other hand, we can identify a large
number of such conditions at one time. We now develop the machinery to
perform such checks systematically. First, observe the following result, which follows
immediately from the definition of the fact set.

5.7 Utility of Fact Closure 

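Let $(P, \Phi)$ be an open specification, and let $X \subseteq \mathrm{Prop}_{\Updownarrow}(P)$. Then
$\mathrm{Rules}(\Phi) \models (\bigwedge X) \Rightarrow (\bigwedge \mathrm{Facts}(X \cup \mathrm{Rules}(\Phi)))$. In other words, the process of computing
$\mathrm{Facts}(X \cup \mathrm{Rules}(\Phi))$ may essentially be viewed as one of applying the rules in
$\mathrm{Rules}(\Phi)$ to $X$. (Note: $\bigwedge$ denotes logical conjunction here, not lattice meet.) □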

5.8 Aggregate Complexity for Fact Closure 

Let $(P, \Phi)$ be an open specification, and let $S$ be a set of subsets of $\mathrm{Prop}_{\Updownarrow}(P)$.
Then the set of sets $\{\mathrm{Facts}(X \cup \mathrm{Rules}(\Phi)) \mid X \in S\}$ may be computed in time
$O(n + r \cdot s + r \cdot \log(r))$, with $n$ the cardinality of $P$, $s$ the number of sets in $S$,
and $r$ the sum of the lengths of the rules in $\mathrm{Rules}(\Phi)$.

Proof. The proof rests largely upon results found in Pj and m-. The process is
broken into two steps. There is a total of $2(n + 2)$ propositions in $\mathrm{Prop}_{\Uparrow}(P) \cup
\mathrm{Prop}_{\Downarrow}(P)$. Assign each proposition a natural-number tag in $\{0, .., 2n + 3\}$. This
takes time $O(n)$. Next, sort the propositions in each antecedent set of each clause.
This takes time $O(r \cdot \log(r))$.

After this preconditioning, the computation of each $\mathrm{Facts}(X \cup \mathrm{Rules}(\Phi))$ takes
just $O(r)$ time, using the techniques in the above-cited references. The total time
for all elements of $S$ is thus $\Theta(s \cdot r)$. Combining these, the running time for the
entire algorithm is $O(n + r \cdot \log(r) + s \cdot r)$. □
5.9 Singleton Associations

Let $(P, \Phi)$ be an open specification, and let $p \in \mathrm{Prop}_{\Updownarrow}(P)$. The singleton
associates of $p$, denoted $\mathrm{SingAsc}(p)$, is just $\mathrm{Facts}(\{p\} \cup \mathrm{Rules}(\Phi))$. In words, $\mathrm{SingAsc}(p)$
is the set of all propositions in $\mathrm{Prop}_{\Updownarrow}(P)$ which can be deduced from $p$ alone,
using the rules in $\mathrm{Rules}(\Phi)$.

5.10 Example 

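Let $(P, \Phi)$ be as in 5.5. Here are the singleton associates.

$x$      $\mathrm{SingAsc}(p_x)$                      $\mathrm{SingAsc}(q_x)$

$\kappa_1$     $\{p_{\kappa_1}, p_{\kappa_4}, p_{\kappa_5}\}$              $\{q_{\kappa_1}\}$
$\kappa_2$     $\{p_{\kappa_2}, p_{\kappa_4}, p_{\kappa_5}\}$              $\{q_{\kappa_2}, q_{\kappa_6}\}$
$\kappa_3$     $\{p_{\kappa_3}, p_{\kappa_4}, p_{\kappa_5}\}$              $\{q_{\kappa_3}, q_{\kappa_6}\}$
$\kappa_4$     $\{p_{\kappa_4}\}$                          $\{q_{\kappa_1}, q_{\kappa_3}, q_{\kappa_4}, q_{\kappa_5}\}$
$\kappa_5$     $\{p_{\kappa_4}, p_{\kappa_5}\}$                   $\{q_{\kappa_1}, q_{\kappa_2}, q_{\kappa_3}, q_{\kappa_4}, q_{\kappa_5}\}$
$\kappa_6$     $\{p_{\kappa_2}, p_{\kappa_3}, p_{\kappa_6}\}$              $\{q_{\kappa_6}\}$
$\bot$      $\{p_\bot\}$                           $\{q_\bot\}$
$\top$      $\{p_\top\}$                           $\{q_\top\}$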



The information that $(\kappa_4 = \kappa_5)$ is easily recovered from these data. Indeed,
note that $q_{\kappa_5} \in \mathrm{SingAsc}(q_{\kappa_4})$ and that $q_{\kappa_4} \in \mathrm{SingAsc}(q_{\kappa_5})$. In light of 5.7,
this means that both $(q_{\kappa_4} \Rightarrow q_{\kappa_5})$ and $(q_{\kappa_5} \Rightarrow q_{\kappa_4})$ are logical consequences of
$\mathrm{Rules}(\Phi)$. Thus, for any $(\mathbf{2}, f) \in \mathrm{Mod}_2(P, \Phi)$, $f(\kappa_4) = \bot$ iff $f(\kappa_5) = \bot$. Thus, it
must be the case that $f(\kappa_4) = f(\kappa_5)$. We now develop a means of discovering
such associations systematically.

5.11 The Static Equivalence 

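Define the relation $\rightsquigarrow$ on $\mathrm{Prop}_{\Updownarrow}(P)$ by $p \rightsquigarrow q$ iff $q \in \mathrm{SingAsc}(p)$. The transitive
closure of $\rightsquigarrow$ is denoted by $\rightsquigarrow^{*}$. The equivalence relation $\equiv$ places into a single
equivalence class all elements which lie in the same cycle in $\rightsquigarrow^{*}$. Formally, $p \equiv q$
iff $p \rightsquigarrow^{*} q$ and $q \rightsquigarrow^{*} p$.

In the above example, only $q_{\kappa_4} \equiv q_{\kappa_5}$.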
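
The computation of the static equivalence may be sketched as follows. This is an illustration under assumptions: propositions are plain strings such as `"p_a"`, the rules are written out by hand for a hypothetical specification consisting of (a ∨ b = c) and (c ≤ a), the bound rules are left out so that no trivial classes arise, and a naive quadratic closure stands in for the linear-time procedure of 5.1.

```python
def closure(start, rules):
    """Naive forward chaining: all atoms derivable from `start` with the Horn rules."""
    facts = set(start)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if head not in facts and all(b in facts for b in body):
                facts.add(head)
                changed = True
    return facts


def static_equivalence_classes(props, rules):
    """Non-trivial classes of mutually derivable propositions."""
    sing = {p: closure({p}, rules) for p in props}        # singleton associates
    classes = set()
    for p in props:
        # p and q are equivalent when each lies in the other's closure
        cls = frozenset(q for q in props if q in sing[p] and p in sing[q])
        if len(cls) > 1:
            classes.add(cls)
    return sorted(sorted(c) for c in classes)


# Rules for the hypothetical specification {(a v b = c), (c <= a)}.
rules = [
    (("p_a",), "p_c"), (("p_b",), "p_c"),          # a <= c, b <= c
    (("q_c",), "q_a"), (("q_c",), "q_b"),          # c false forces a, b false
    (("q_a", "q_b"), "q_c"),                       # join condition
    (("p_c",), "p_a"), (("q_a",), "q_c"),          # c <= a
]
props = ["p_a", "p_b", "p_c", "q_a", "q_b", "q_c"]
print(static_equivalence_classes(props, rules))
```

Both the class of p_a and p_c and the class of q_a and q_c witness that a and c must coincide in every two-element model, while b remains independent.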

5.12 The Complexity of Determining Static Equivalence 

Let (P,<P) be an open specification. Then, with n, r, and s defined as in 5.8, 
there is an algorithm which computes from {P, <P) in worst-case time 0(n^ -\- 
n-r -\- r-log(r)). 

Proof. There are 2n-|-4 (= 0{n)) distinct sets of the form SingAsc(p), two for each 
element of Aug(P). Transitive closure has the same computational complexity 
as matrix multiplication d 10.3.6]; thus, the equivalence relation may be 
computed in time 0{n^) from {Facts({p} U Rules(^)) | p G Prop^(P)}. Thus, in 
view of 5.8, the total complexity is 0{n^) -F 0{n -F r-s -F r -log(r)) = 0{n^ -F n- 
r -F r-log(r)). □ 

The above bound is a substantial improvement over 0(n^-r-log(r)) which
was reported for the algorithm in m Sec. 2.1]. The general idea also lends itself 
to extension, as outlined below. 

5.13 Higher-Level Static Consistency Checking 

Although the technique just described detects many element equivalences, it
cannot find them all. As a concrete example, consider the open specification
$(P, \Phi)$ with $P = \{\tau_i \mid 1 \leq i \leq 5\}$, and $\Phi = \{(\tau_i \vee \tau_j = \tau_4) \mid 1 \leq i < j \leq 3\}
\cup \{(\tau_i \wedge \tau_j = \tau_5) \mid 1 \leq i < j \leq 3\}$. It is not difficult to see that all five
elements of $P$ must be collapsed to the same value in any model. However, this
fact is not detected by the static equivalence algorithm of 5.11. It can, however,
be detected with a higher-level static consistency check, which works with sets
of the form $\mathrm{Facts}(X \cup \mathrm{Rules}(\Phi))$, with $X$ a subset of $\mathrm{Prop}_{\Updownarrow}(P)$ of size at most two. Thus,
the technique of static consistency checking may be extended. Indeed, if we
work with all sets of the form $\mathrm{Facts}(X \cup \mathrm{Rules}(\Phi))$ for $X$ any subset of $\mathrm{Prop}_{\Updownarrow}(P)$, then
consistency checking may be made complete. Unfortunately, the computational
complexity which results when all subsets of $P$ are considered yields no advantage
over direct satisfaction testing. What may prove promising, though, is to work
with all subsets of a small size bound, say two or three. The complexity is still
quite manageable, yet many inconsistencies may be detected. This approach is
not elaborated further here.
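
The gap between the singleton check and the two-element check can be observed directly on this example. The sketch below is illustrative only: propositions are strings such as `"p_t1"`, the bound rules are omitted, the closure is the naive quadratic one, and the names `example_rules`, `singleton_equivalences`, and `two_element_models` are hypothetical.

```python
from itertools import combinations, product

T = ["t1", "t2", "t3", "t4", "t5"]
PAIRS = list(combinations(["t1", "t2", "t3"], 2))


def example_rules():
    """Horn rules for (ti v tj = t4) and (ti ^ tj = t5), 1 <= i < j <= 3."""
    rules = []
    for a, b in PAIRS:
        for s in (a, b):
            rules.append(((f"p_{s}",), "p_t4"))        # s <= t4
            rules.append((("q_t4",), f"q_{s}"))
            rules.append((("p_t5",), f"p_{s}"))        # t5 <= s
            rules.append(((f"q_{s}",), "q_t5"))
        rules.append(((f"q_{a}", f"q_{b}"), "q_t4"))   # join condition
        rules.append(((f"p_{a}", f"p_{b}"), "p_t5"))   # meet condition
    return rules


def closure(start, rules):
    facts = set(start)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if head not in facts and all(b in facts for b in body):
                facts.add(head)
                changed = True
    return facts


def singleton_equivalences(rules):
    props = [f"{k}_{t}" for k in "pq" for t in T]
    sing = {x: closure({x}, rules) for x in props}
    return [(x, y) for x, y in combinations(props, 2)
            if y in sing[x] and x in sing[y]]


def two_element_models():
    models = []
    for bits in product([False, True], repeat=len(T)):
        f = dict(zip(T, bits))
        if all((f[a] or f[b]) == f["t4"] and (f[a] and f[b]) == f["t5"]
               for a, b in PAIRS):
            models.append(f)
    return models


rules = example_rules()
print(singleton_equivalences(rules))                   # nothing found at level one
print("p_t2" in closure({"p_t1", "p_t3"}, rules))      # a level-two consequence
print(all(len(set(f.values())) == 1 for f in two_element_models()))
```

The singleton closures contain no cycle, yet a two-proposition premise already yields a new consequence, and every two-element model is constant, in agreement with the collapse claimed above.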

6 Conclusions and Further Directions 

Open specification clearly imposes a substantial computational burden.
Therefore, a decision to employ it must be weighed carefully. As implied by the work
in Sec. 2, the first question to ask is whether or not natural semantics are
needed for both meet and join. If only natural meet semantics is needed, then open
specification, as described in this paper, is not an issue.^1 However, if one is
interested in manipulating general classes of representations which involve disjunction,
then it may be a necessity, although alternatives to managing limited disjunction
have been proposed and implemented. We cannot and do not address the
adequacy of such approaches; rather, we proceed under the assumption that join
semantics, and hence distributive hierarchies, are desired.

^1 It is unclear whether the idea of open specification of meet-only hierarchies is
interesting, or nontrivial. We know of no existing work on the topic, and no systems
which embody the idea.

As illustrated by the construction of 3.7, complete representation of a
distributive hierarchy will generally be infeasible. (See also cn Sec. 0] for some
examples.) Therefore, it would seem that open specification is the only alternative.
Although we have presented an efficient method for determining certain implied
constraints in Sec. 5, such techniques, by themselves, cannot counterbalance the
NP-hardness of the underlying problems. Rather, techniques for tackling these
underlying problems directly must be developed. Fortunately, NP-hard problems
are very common, and there is a substantial body of research on computational
methods. However, most of these techniques deal with optimization problems,
in which the notion of an approximate answer makes sense. On the other
hand, as made clear in 4.7, the questions involved in open specification are
related closely to satisfiability questions for logical formulas; such problems do not
admit useful notions of approximation. Fortunately, there is an active body of
research on such problems, as well as a library of tools known as SATLIB (The
Satisfiability Library), which is available on the world-wide web. Our next steps
in addressing the problems of open specification must clearly be experimental
ones, and will proceed as follows.

1. Direct solution of the associated logic problems, as characterized in 4.7, will
be addressed using SATLIB tools such as GSAT [23], a tool which is effective
in the solution of large satisfiability problems.

2. A study of techniques related to re-use. One of the features of the satisfiability
problems surrounding open specification is that not one, but a whole family
of satisfiability problems (one for each two-element model) must be obtained.
It is clear from the results of Sec. 4, and 4.7 in particular, that the members
of the families of formulas to be solved are closely related; they may differ
only slightly. Therefore, techniques which solve whole families of related
formulas in a fashion more efficient than solving each individually must be
developed. As far as we know, such techniques are not part of current work
on the subject.

3. The alternate characterization of two-element models presented in 5.4 suggests
that techniques which address re-use in the specific context of Horn and
XOR-constraints may prove useful. As far as we know, SATLIB-style results for
such specially conditioned formulas do not exist at this time.

Abstract. The paper proposes a type logical reformulation of Jacob-
son’s ([9]) treatment of anaphoric dependencies in Categorial Grammar. 
To this end, the associative Lambek Calculus is extended with a new con- 
nective. In the first part, its proof theory is introduced and the logical 
properties of the resulting calculus are discussed. In the second part, this 
system is applied to several linguistic phenomena concerning the inter- 
action of pronominal anaphora, VP ellipsis and quantifier scope. Finally, 
a possible extension to cross-sentential anaphora is considered. 



1 Introduction 

Anaphora phenomena are a challenge to Categorial Grammar (CG) for several 
reasons. To start with, the standard treatment of anaphoric pronouns as vari- 
ables is not really viable in CG due to its essentially variable free design. This is 
self-evident in Combinatory Categorial Grammar (CCG) since here semantic op- 
erations are by definition restricted to combinators (in the sense of Combinatory 
Logic). Thus in CCG the grammar does not provide variable binding devices. 
Researchers working in the framework of Type Logical Grammar usually don’t 
stress the variable free setup of the syntax-semantics interface, but things are 
not different here than in CCG. This becomes clear if one acknowledges the fact 
that the type logics used in CG can be seen as fragments of positive Intuitionistic 
Logic. Thus all type logical proof terms (= admissible semantic operations) can 
be expressed as combinatory terms (using only Curry’s S and K). 

If one accepts that anaphora does not involve variables, one is faced with 
another problem. Virtually by definition, anaphora involves a multiple use of 
semantic resources. To put it the other way round, anaphors are expressions 
that use a resource without consuming it. In resource conscious logics, re-use 
of resources is characteristic for systems that contain the structural rule of con- 
traction, like Relevance Logic or Intuitionistic Logic. Such logics lack the finite 
reading property, i.e., one and the same valid sequent may have infinitely many proof
terms. A simple but telling example is the sequent $p \to p \Rightarrow p \to p$. All terms
$\lambda x.f^n x$ for finite $n$ are valid relevant or intuitionistic proof terms. Natural
language utterances are at most finitely ambiguous though. The finite reading
property is thus essential for an adequate grammar logic. So a type logical treatment
of anaphora should be strong enough to admit multiple use of resources, but
weak enough to have the finite reading property.

In this paper, we first propose an extension of the Lambek Calculus L ([13]) 
that overcomes this problem. In the sequel, we illustrate its linguistic appli- 
cations. After discussing the treatment of anaphoric pronouns, we study the 
interaction of this system with Moortgat’s account of quantification ([15,16]). 
Next we extend the analysis to VP ellipsis (VPE henceforth) and discuss the in- 
terplay of pronominal anaphora and quantification with VPE. Finally we sketch 
how cross-sentential anaphora can be handled in this setup. 

2 The Logic L| 

2.1 Background: Jacobson’s Proposal 

In a series of publications ([6, 7, 8, 9]), Pauline Jacobson has shown how pronomi- 
nal anaphora can be handled successfully in CCG while maintaining the variable 
freeness of this framework. The basic intuition of her proposal is the idea that 
anaphoric expressions denote functions from antecedent meanings to contextu- 
ally determined meanings. Applied to anaphoric pronouns, this means that they 
denote the identity function on individuals. Due to the strict category-to-type 
correspondence in CG, this must be reflected in the syntactic category. To this
end, she extends the inventory of type forming connectives of CCG with a third
slash |. A sign of category A|B is an item that needs an antecedent of
category B to behave like a sign of category A. Accordingly, its denotation will
be a function from B-denotations to A-denotations. In other words, | creates a
functional category like the other two slashes. Under this account, anaphoric
pronouns have category N|N and denote the identity function on individuals.
The non-local nature of anaphora is taken care of by a generalized version of
function composition which makes arbitrarily large portions of syntactic material
transparent for anaphoric dependencies. The job of connecting an anaphor to
its antecedent (multiplication of a resource in the constructive jargon) is taken
over by a modified version of Curry’s S that she calls Z.

The logic that is introduced in the next subsection can be seen as a transla- 
tion of Jacobson’s ideas into the type logical architecture. In particular, we will 
adopt Jacobson’s slash and its intuitive interpretation, including the functional 
semantics of anaphors. Furthermore all relevant combinators of her system are 
theorems of our logic. Nonetheless, since the Lambek Calculus is completely
associative but CCG is not, the empirical predictions of the two
approaches do not completely coincide. In Sect. 5 we will try to demonstrate 
that the move from combinatory to type logical design is advantageous from a 
descriptive point of view. 

2.2 Sequent Presentation 

The set of formulas T of the logic L| is defined as the closure of some set A of
atomic formulas under the binary operations \, •, / and |.

The sequent presentation of L| extends Lambek's ([13]) syntactic calculus
L with the logical rules for the third slash |. Sequent rules are augmented
with Curry-Howard terms.



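\[
\frac{}{x : A \Rightarrow x : A}\ \mathit{id}
\qquad
\frac{X \Rightarrow M : A \qquad Y, x : A, Z \Rightarrow N : B}
     {Y, X, Z \Rightarrow N[x \leftarrow M] : B}\ \mathit{Cut}
\]

\[
\frac{X, x : A, y : B, Y \Rightarrow M : C}
     {X, z : A \bullet B, Y \Rightarrow M[x \leftarrow (z)_0][y \leftarrow (z)_1] : C}\ \bullet L
\qquad
\frac{X \Rightarrow M : A \qquad Y \Rightarrow N : B}
     {X, Y \Rightarrow \langle M, N \rangle : A \bullet B}\ \bullet R
\]

\[
\frac{X \Rightarrow M : A \qquad Y, x : B, Z \Rightarrow N : C}
     {Y, y : B/A, X, Z \Rightarrow N[x \leftarrow (yM)] : C}\ /L
\qquad
\frac{X, x : A \Rightarrow M : B}
     {X \Rightarrow \lambda x.M : B/A}\ /R \quad (X \text{ non-empty})
\]

\[
\frac{X \Rightarrow M : A \qquad Y, x : B, Z \Rightarrow N : C}
     {Y, X, y : A \backslash B, Z \Rightarrow N[x \leftarrow (yM)] : C}\ \backslash L
\qquad
\frac{x : A, X \Rightarrow M : B}
     {X \Rightarrow \lambda x.M : A \backslash B}\ \backslash R \quad (X \text{ non-empty})
\]

\[
\frac{Y \Rightarrow M : B \qquad X, x : B, Z, y : A, W \Rightarrow N : C}
     {X, Y, Z, z : A|B, W \Rightarrow N[x \leftarrow M][y \leftarrow (zM)] : C}\ |L
\]

\[
\frac{x : B, y : p, X \Rightarrow \langle M, y, N \rangle : B \bullet p \bullet A
      \qquad
      x : B, X \Rightarrow \langle M, N \rangle : B \bullet A}
     {X \Rightarrow \lambda x.N : A|B}\ |R
\]

with the following side conditions on |R:

- X is non-empty,
- p is atomic and does not occur in A, B, X,
- M =_{\alpha\beta\eta} x.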



A few comments are in order. Even though the Jacobsonian slash has a
functional semantics and thus resembles the standard categorial slashes, it has
a different nature since it cannot be reconstructed as a residuation operation
of some product operator. In the presence of cut, rule |L is equivalent to the
axioms

\[
\begin{array}{lll}
(1) & \text{a.} & x : A,\ y : B|A \Rightarrow \langle x, (yx) \rangle : A \bullet B \\
    & \text{b.} & x : A,\ y : B,\ z : C|A \Rightarrow \langle x, y, (zx) \rangle : A \bullet B \bullet C
\end{array}
\]

Together with the rule of proof |R this expresses the intuition that a sign has
category A|B if and only if it behaves like a sign of category A in the presence
of an antecedent of type B. Anaphor and antecedent may, but need not, be
adjacent. This motivates the two premises in |R. Since the atom p in rule |R
does not occur anywhere else, it behaves like a variable over the material
between anaphor and antecedent. Furthermore it has to be ensured that neither
the antecedent nor the material in between is affected by anaphora resolution.
This motivates the constraint on the Curry-Howard terms in the premises of |R.
So labeling is not just a bookkeeping device but a genuine restriction here.

L| has the desired proof theoretic properties. To start with, cut elimination 
is possible. 

Theorem 1 (Cut Elimination) 

Cut is admissible in L| . 

Proof. The proof relies on the following two lemmas:

Lemma 1 If Π is a correct proof and p an atomic formula that occurs in the
conclusion of Π, then Π[p ← B] is a correct proof as well, where Π[p ← B] is
the result of replacing all occurrences of p in Π by B.

Proof. By induction over the complexity of Π. □

Lemma 2 With X^• we refer to the formula that results from replacing all
commas in X by •. Then it holds that

\[
\vdash X \Rightarrow X^{\bullet}
\]

Proof. Induction over the length of X. □

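The cut elimination algorithm follows the one given in [13]. Principal cut for
| (see Fig. 1 for the case where the Z in |L is non-empty and Fig. 2 for the
case where it is empty) apparently poses a problem since it may replace one cut
by three cuts of a higher degree. Nevertheless it can be shown that the
algorithm always terminates, for the following reason: We call a proof special
iff all atoms occurring in it are pairwise different unless a sequent rule
requires them to be identical. Clearly every proof can be transformed into a
special proof by renaming of atoms. If we restrict our attention to special
proofs, the principal cut for | reduces the total number of atoms occurring in
the proof (since p completely disappears). This parameter is not increased by
any other cut elimination step. This guarantees that cut elimination eventually
terminates. The restriction to special proofs is no real restriction since we
can always transform a given proof into a special proof via renaming, perform
cut elimination and finally reverse the renaming. □

The proof

\[
\dfrac{
  \dfrac{\overset{\Pi_1}{B, p, W \Rightarrow B \bullet p \bullet A}
         \qquad
         \overset{\Pi_2}{B, W \Rightarrow B \bullet A}}
        {W \Rightarrow A|B}\ |R
  \qquad
  \dfrac{\overset{\Pi_3}{Y \Rightarrow B}
         \qquad
         \overset{\Pi_4}{X, B, Z, A, U \Rightarrow C}}
        {X, Y, Z, A|B, U \Rightarrow C}\ |L
}{X, Y, Z, W, U \Rightarrow C}\ \mathit{Cut}
\]

is transformed into

\[
\dfrac{
  \dfrac{}{Z \Rightarrow Z^{\bullet}}\ \text{Lemma 2}
  \qquad
  \dfrac{
    \dfrac{\overset{\Pi_3}{Y \Rightarrow B}
           \qquad
           \overset{\Pi_1[p \leftarrow Z^{\bullet}]}{B, Z^{\bullet}, W \Rightarrow B \bullet Z^{\bullet} \bullet A}}
          {Y, Z^{\bullet}, W \Rightarrow B \bullet Z^{\bullet} \bullet A}\ \mathit{Cut}
    \qquad
    \dfrac{\overset{\Pi_4}{X, B, Z, A, U \Rightarrow C}}
          {X, B \bullet Z^{\bullet} \bullet A, U \Rightarrow C}\ \bullet L
  }{X, Y, Z^{\bullet}, W, U \Rightarrow C}\ \mathit{Cut}
}{X, Y, Z, W, U \Rightarrow C}\ \mathit{Cut}
\]

Fig. 1. Principal cut for |, Z non-empty

The proof

\[
\dfrac{
  \dfrac{\overset{\Pi_1}{B, p, W \Rightarrow B \bullet p \bullet A}
         \qquad
         \overset{\Pi_2}{B, W \Rightarrow B \bullet A}}
        {W \Rightarrow A|B}\ |R
  \qquad
  \dfrac{\overset{\Pi_3}{Y \Rightarrow B}
         \qquad
         \overset{\Pi_4}{X, B, A, U \Rightarrow C}}
        {X, Y, A|B, U \Rightarrow C}\ |L
}{X, Y, W, U \Rightarrow C}\ \mathit{Cut}
\]

is transformed into

\[
\dfrac{
  \dfrac{\overset{\Pi_3}{Y \Rightarrow B}
         \qquad
         \overset{\Pi_2}{B, W \Rightarrow B \bullet A}}
        {Y, W \Rightarrow B \bullet A}\ \mathit{Cut}
  \qquad
  \dfrac{\overset{\Pi_4}{X, B, A, U \Rightarrow C}}
        {X, B \bullet A, U \Rightarrow C}\ \bullet L
}{X, Y, W, U \Rightarrow C}\ \mathit{Cut}
\]

Fig. 2. Principal cut for |, Z empty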



Although |R lacks the subformula property, every sequent rule except cut
increases complexity in the sense defined below:

Definition 1

1. d(p) = 1 with p atomic

2. d(A ∘ B) = d(A) + d(B) + 1, where ∘ ranges over •, \ and /

3. d(A|B) = d(A) + 2d(B) + 5

4. d(A_1, ..., A_n ⇒ B) = d(A_1) + ... + d(A_n) + d(B)
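The measure of Definition 1 can be checked mechanically. The following is a minimal sketch, not part of the original presentation: the class names `Atom` and `Bin`, the encoding of connectives as strings, and the sample |R instance (A = B = N, X = N|N) are all assumptions made for illustration. It shows that both premises of this |R instance are strictly less complex than its conclusion.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Atom:
    name: str


@dataclass(frozen=True)
class Bin:
    op: str            # "*" (product), "\\", "/", or "|" (anaphoric slash)
    left: "Formula"    # for "|": the result category A in A|B
    right: "Formula"   # for "|": the antecedent category B in A|B


Formula = Union[Atom, Bin]


def d(f: Formula) -> int:
    """Complexity of a formula, clauses 1-3 of Definition 1."""
    if isinstance(f, Atom):
        return 1
    if f.op == "|":
        return d(f.left) + 2 * d(f.right) + 5
    return d(f.left) + d(f.right) + 1


def d_sequent(antecedent: List[Formula], succedent: Formula) -> int:
    """Complexity of a sequent A1, ..., An => B, clause 4 of Definition 1."""
    return sum(d(a) for a in antecedent) + d(succedent)


# Hypothetical |R instance with A = B = N and X = [N|N].
N, p = Atom("N"), Atom("p")
X = [Bin("|", N, N)]
conclusion = d_sequent(X, Bin("|", N, N))                        # X => A|B
premise_1 = d_sequent([N, p] + X, Bin("*", N, Bin("*", p, N)))   # B, p, X => B*p*A
premise_2 = d_sequent([N] + X, Bin("*", N, N))                   # B, X => B*A
print(premise_1, premise_2, conclusion)  # prints: 15 12 16
```

The weight 2d(B) + 5 in clause 3 is what compensates for the extra copies of B and the fresh atom p that appear in the premises of |R.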

Theorem 2 The proof search space for L| is finite.

Proof. Cut free proof search reduces complexity, and at every point in proof
search, only finitely many sequent rules are applicable. □

Corollary 1 (Decidability) L| is decidable.

Proof. Immediately from Theorem 2. □

Corollary 2 (Finite Reading Property) L| has the finite reading property.

Proof. Immediately from Theorem 2. □



2.3 Natural Deduction 

The sequent system is indispensable since it guarantees decidability, but for 
practical purposes it is rather awkward. A presentation in natural deduction 
(ND) format is better suited to present concrete derivations. Besides, it has an 
appealing allusion to the tree format linguists are used to. 

We start with a sequent style presentation of the natural deduction system 
(Fig. 3). Besides the identity rule and the cut rule (which are identical to the 
corresponding rules in the sequent system and therefore omitted), we have an 
introduction rule and an elimination rule for each connective. 



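\[
\frac{}{x : A, y : B \Rightarrow \langle x, y \rangle : A \bullet B}\ \bullet I
\qquad
\frac{X \Rightarrow M : A \bullet B \qquad Y, x : A, y : B, Z \Rightarrow N : C}
     {Y, X, Z \Rightarrow N[x \leftarrow (M)_0][y \leftarrow (M)_1] : C}\ \bullet E
\]

\[
\frac{x : A, X \Rightarrow M : B}
     {X \Rightarrow \lambda x.M : A \backslash B}\ \backslash I
\qquad
\frac{X \Rightarrow M : A \qquad Y \Rightarrow N : A \backslash B}
     {X, Y \Rightarrow (NM) : B}\ \backslash E
\]

\[
\frac{X, x : A \Rightarrow M : B}
     {X \Rightarrow \lambda x.M : B/A}\ /I
\qquad
\frac{X \Rightarrow M : A/B \qquad Y \Rightarrow N : B}
     {X, Y \Rightarrow (MN) : A}\ /E
\]

\[
\frac{X \Rightarrow M : A \qquad Y \Rightarrow N : B \qquad Z \Rightarrow O : C|A}
     {X, Y, Z \Rightarrow \langle M, N, (OM) \rangle : A \bullet B \bullet C}\ |E
\]

\[
\frac{x : B, y : p, X \Rightarrow \langle N, y, M \rangle : B \bullet p \bullet A
      \qquad
      x : B, X \Rightarrow \langle N, M \rangle : B \bullet A}
     {X \Rightarrow \lambda x.M : A|B}\ |I
\]

with p not occurring in A, B, X, and N =_{\alpha\beta\eta} x.

Fig. 3. Natural Deduction for L|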



Natural deductions are more conveniently carried out in tree form. The
building blocks are given in Fig. 4. Note that a complete deduction always ends
in a single conclusion, despite the fact that • elimination and | elimination
have multiple conclusions. For simplicity we combined the |-elimination rule
with two subsequent applications of the •-elimination rule, thus removing the
products in the conclusion of |-elimination. The parentheses in the premises of
| introduction indicate that these premises must be derivable both with and
without the material in parentheses. Note that the only structural constraint
on anaphora resolution is the requirement that the antecedent precede the
anaphor. No command relations of whatever kind are involved.

\[
\dfrac{M : A \qquad N : B}{\langle M, N \rangle : A \bullet B}\ \bullet I
\qquad
\dfrac{M : A \bullet B}{(M)_0 : A \qquad (M)_1 : B}\ \bullet E
\]

\[
\dfrac{\begin{array}{c} [x : A]^i\ \cdots \\ \vdots \\ M : B \end{array}}
      {\lambda x.M : A \backslash B}\ \backslash I, i
\qquad
\dfrac{\begin{array}{c} \cdots\ [x : A]^i \\ \vdots \\ M : B \end{array}}
      {\lambda x.M : B/A}\ /I, i
\qquad
\dfrac{\begin{array}{c} [x : B]^i\ ([y : p]^i)\ \cdots \\ \vdots \\
       \langle x, (y,)\, M \rangle : B \bullet (p\, \bullet)\, A \end{array}}
      {\lambda x.M : A|B}\ |I, i
\]

\[
\dfrac{M : A \qquad N : A \backslash B}{(NM) : B}\ \backslash E
\qquad
\dfrac{M : A/B \qquad N : B}{(MN) : A}\ /E
\qquad
\dfrac{M : B \quad \cdots \quad N : A|B}{M : B \quad \cdots \quad (NM) : A}\ |E
\]

Fig. 4. Natural deduction in tree format

For better readability and to stress the similarity to conventional coindexing
of constituents, we simplify the notation for |E somewhat (see Fig. 5).

\[
[M : B]_i \quad \cdots \quad \dfrac{[N : A|B]_i}{(NM) : A}\ |E
\]

Fig. 5. Simplified notation for |E

When working with Natural Deduction in tree format, it has to be kept in mind
that the domain of rule applications are complete proof trees, not arbitrary
subtrees of a proof tree. This is particularly important when |E is involved.
Both parts of an anaphoric link belong to one and the same tree. Therefore it
is illicit to let another rule operate on a subtree that includes one part of
the anaphoric link and excludes the other. (An example of a violation of this
constraint is given in Fig. 6. Here the premise x : A is connected with the
premise z : C|A by an anaphoric link, i.e., an application of |E, but the scope
of \I includes the former and excludes the latter.) This blocks derivations
where the proof term of the conclusion contains free variables that do not
correspond to any premise.

\[
\dfrac{
  \dfrac{\dfrac{[x : A]_i \qquad y : B}{\langle x, y \rangle : A \bullet B}\ \bullet I}
        {\lambda x.\langle x, y \rangle : A \backslash (A \bullet B)}\ \backslash I, 1
  \qquad
  \dfrac{[z : C|A]_i}{(zx) : C}\ |E
}{\langle \lambda x.\langle x, y \rangle, (zx) \rangle : (A \backslash (A \bullet B)) \bullet C}\ \bullet I
\]

Fig. 6. An illicit Natural Deduction derivation

3 Pronouns and Quantification 

Following Jacobson, we assume that pronouns like he have category N|N and
denote the identity function on individuals, i.e., the associated semantic term
is λx.x. For a simple example like

(2) John said he walked 

where the only potential antecedent of the pronoun is a proper noun, we have 
the two possible derivations shown in Fig. 7, corresponding to the coreferential 
and the free reading of the pronoun. 

Things become somewhat more involved when we consider possible interac- 
tion of anaphora resolution with hypothetical reasoning. Nothing prevents us 
from using a hypothesis of the appropriate type as antecedent for anaphora 
resolution. For example, in the VP 

(3) said he walked 

the pronoun can be linked to the subject argument place of the VP, as Fig. 8 
demonstrates. This VP can for instance be combined with a subject relative 
pronoun to yield the relative clause who said he walked. Another type of con- 
struction where this kind of derivation is crucial are sloppy readings of VPE that 
will be discussed below. 
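The semantic side of these derivations can be traced with ordinary function application. The sketch below is illustrative only: it renders the constants SAY, WALK and J as Python string builders (an assumption made here for readability, not a claim about any implementation) and treats the pronoun as the identity function, as proposed above. It reproduces the coreferential and the free reading of (2) and the bound VP meaning of (3).

```python
# Semantic constants are rendered as strings; all names are illustrative.
def WALK(x):
    return f"WALK({x})"


def SAY(p):
    # SAY takes a proposition and then a subject, mirroring (N\S)/S.
    return lambda x: f"SAY({p})({x})"


def he(x):
    # Pronoun of category N|N: the identity function on individuals.
    return x


J = "j"

# (2), coreferential reading: the pronoun is resolved to the premise John.
coreferential = SAY(WALK(he(J)))(J)


# (2), free reading: the antecedent stays open, giving a function of type S|N.
def free(x):
    return SAY(WALK(he(x)))(J)


# (3): the pronoun is linked to the subject argument place of the VP.
def bound_vp(x):
    return SAY(WALK(he(x)))(x)


print(coreferential)   # prints: SAY(WALK(j))(j)
print(free("b"))       # prints: SAY(WALK(b))(j)
print(bound_vp("b"))   # prints: SAY(WALK(b))(b)
```

Note that `bound_vp(J)` and `coreferential` coincide; the two derivations only come apart once the VP meaning is reused, as in ellipsis.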

[Fig. 7. Derivations of John said he walked. Lexical assignments: John, J : N;
said, SAY : (N\S)/S; he, λx.x : N|N; walked, WALK : N\S. The coreferential
derivation links the pronoun to the premise J : N by |E and ends in
SAY(WALK J)J : S. The free derivation links the pronoun to a hypothesis x : N,
derives ⟨x, y, SAY(WALK x)J⟩ : N • p • S, and ends by |I in
λx.SAY(WALK x)J : S|N.]

[Fig. 8. Derivation of said he walked. The pronoun is linked by |E to a
hypothesis x : N in subject position, which is discharged by \I; the derivation
ends in λx.SAY(WALK x)x : N\S.]

Binding to hypothetical antecedents is not restricted to slash introduction
rules. Another obvious case in point is the interaction of anaphora with
quantification. Here we adopt the type logical treatment of quantification that
was proposed by Michael Moortgat (see for instance [15]). To repeat the basic
ingredients very briefly, Moortgat proposes a new three place type constructor
q. A sign α has category q(A, B, C) iff replacing a sign of category A by α in
the context of a super-constituent of type B yields a result of category C.
This is reflected by the Natural Deduction rules in Fig. 9. The elimination
rule involves hypothetical reasoning and can thus lead to binding of anaphors.
Let us consider the example

(4) Everybody said he walked

\[
\dfrac{x : A}{\lambda y.yx : q(A, B, B)}\ qI
\qquad
\dfrac{\begin{array}{c} x : q(A, B, C) \\ \ \end{array}
       \qquad
       \begin{array}{c} [y : A]^i \\ \vdots \\ \alpha : B \end{array}}
      {x(\lambda y.\alpha) : C}\ qE, i
\]

Fig. 9. Natural Deduction rules for q(A, B, C)

Quantifiers like everybody have category q(N, S, S), so in the course of
scoping the quantifier, a hypothesis of category N is temporarily introduced.
This hypothesis can in turn serve as antecedent of he, as illustrated in
Fig. 10.



[Fig. 10. Derivation of Everybody said he walked. Lexical assignments:
everybody, EVERY : q(N, S, S); said, SAY : (N\S)/S; he, λx.x : N|N; walked,
WALK : N\S. The pronoun is linked by |E to the hypothesis x : N that stands in
for the quantifier; qE then yields EVERY(λx.SAY(WALK x)x) : S.]



If we reverse the order of the quantifier and the pronoun as in (5), the
derivation of a bound reading will fail, even though the pronoun is in the
scope of the quantifier.

(5) *He_i said everybody_i walked

This configuration, a Strong Crossover violation, is ruled out since the
hypothesis that temporarily replaces the quantifier does not precede the
pronoun. Thus |-elimination cannot be applied.

Like any ND rule, q-elimination can only be applied to complete trees. If the
hypothetical N that is used in qE serves as the antecedent of a pronoun, this
pronoun must be in the scope of qE. Linguistically speaking, this means that a
bound pronoun is always in the scope of its binder. This excludes for instance
a wide scope reading of the indefinite object in (6) if the pronoun is bound by
the subject.

(6) Every man saw a friend of his

The way the present system excludes such readings is similar to the one
proposed in [19], even though the treatment of pronouns in general is
different.

Finally it should be stressed that the only constraints on pronoun binding
here are the requirements that 1. the quantifier precedes the pronoun, and 2.
the pronoun is in the scope of the quantifier. So the derivation of a
construction like (7), where the binding quantifier does not c-command the
pronoun under the standard conception of constituency, does not differ
substantially from the previous example (Fig. 11).

(7) Everybody’s mother loves him 



[Fig. 11. Everybody's mother loves him. Lexical assignments: everybody,
EVERY : q(N, S, S); 's, OF : N\(N/CN); mother, MOTHER : CN; loves,
LOVE : (N\S)/N; him, λx.x : N|N. The pronoun is linked by |E to the hypothesis
y : N that stands in for the quantifier; qE then yields
EVERY(λy.LOVE y (OF y MOTHER)) : S.]






Again, if we change the order of pronoun and quantifier, the derivation will
fail since the precedence requirement for |E is not met.

(8) *His mother loves everybody

So the precedence requirement also accounts for Weak Crossover violations.

As mentioned above, binding to hypothetical antecedents is not restricted to
quantification. Another obvious case in point is wh-movement. Several proposals
for a type logical treatment of this complex can be found in the literature
(see for instance [14, 18]). Despite the differences in detail, they share the
assumption that a hypothesis is put into the "base position" of wh-movement
that is later discharged and bound by the operator. So we correctly predict the
same patterns concerning the interaction of binding and scope and with respect
to Crossover phenomena. Arguably, association with focus involves hypothetical
reasoning as well (see [10] for an attempt to spell this idea out in a
multi-modal framework). Accordingly, we find bound readings and Crossover
effects if the antecedent of a pronoun is focused. (The latter observation was
initially made in [1]. Example (10) is from [20].)

(9) Only JOHN hates his mother

(10) a. We only expect HIM to be betrayed by the woman he loves
     b. We only expect him to be betrayed by the woman HE loves

The pronoun in sentence (9) can be bound, i.e., the sentence can mean John is
the only x such that x hates x's mother. Example (10) illustrates that binding
by focus displays Weak Crossover effects. Sentence (10a) has a reading saying
that the referent of him is the only person x such that we expect x to be
betrayed by the woman x loves. No such reading is available in (10b).

4 VP Ellipsis 

This treatment of anaphora can straightforwardly be extended to VPE. Ignoring
matters of tense and mood, we treat the stranded auxiliary in the second
conjunct of constructions like (11) as a proform for VPs.

(11) John walked, and Bill did too

So did will be assigned the category (N\S)|(N\S) and the meaning λP.P, i.e.,
the identity function on properties. The derivation for (11) is given in
Fig. 12 (we also ignore the contribution of too since it is irrelevant for the
semantics of VPE, though not for the pragmatics).



[Fig. 12. John walked, and Bill did (too). Lexical assignments: John, J : N;
walked, WALK : N\S; and, AND : (S\S)/S; Bill, B : N; did, λP.P : (N\S)|(N\S).
The proform is linked by |E to the source VP WALK : N\S; the derivation ends in
AND(WALK B)(WALK J) : S.]



What makes VPE an interesting topic is of course its complex interaction with 
pronominal anaphora and quantification. Due to limitation of space, we cannot 
give an in-depth investigation of these issues here. Instead we will content our- 
selves with a discussion of some of the most frequently discussed examples from 
the literature. 

The first non-trivial issue in this connection is the well-known strict/sloppy
ambiguity in constructions like (12).

(12) John revised his paper, and Harry did too

The crucial step for the derivation of the sloppy reading is already given for
an analogous example in Fig. 8: the pronoun is bound to the subject argument
place of the source VP. From this we can continue the derivation completely
in parallel to Fig. 12, and we end up with the meaning
AND(SAY(WALK B)B)(SAY(WALK J)J). Crucially, here the pronoun was bound by a
hypothetical antecedent. Of course it is also licit to bind the pronoun to the
actual subject John and then do ellipsis resolution, which results in the
strict reading. The derivation of both readings is given in Fig. 13.
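The difference between the two readings of (12) comes down to which VP meaning the proform picks up. The following sketch is an illustration under stated assumptions: the constants R (revise), P (paper of) and AND are rendered as hypothetical string builders, and `did` and `his` are the identity functions on properties and individuals respectively, as in the text.

```python
# All constants are illustrative string builders, not a fixed lexicon.
def P(x):
    return f"P({x})"


def R(obj):
    # R takes an object and then a subject, mirroring (N\S)/N.
    return lambda subj: f"R({obj})({subj})"


def AND(q):
    return lambda p: f"AND({q})({p})"


def did(vp):
    # Proform of category (N\S)|(N\S): identity on properties.
    return vp


def his(x):
    # Pronoun of category N|N: identity on individuals.
    return x


# Strict: the pronoun is first resolved to the actual subject John, so the
# source VP denotes the property of revising John's paper.
strict_vp = R(P(his("j")))
strict = AND(did(strict_vp)("h"))(strict_vp("j"))


# Sloppy: the pronoun is bound to the subject argument place of the source VP.
def sloppy_vp(x):
    return R(P(his(x)))(x)


sloppy = AND(did(sloppy_vp)("h"))(sloppy_vp("j"))

print(strict)  # prints: AND(R(P(j))(h))(R(P(j))(j))
print(sloppy)  # prints: AND(R(P(h))(h))(R(P(j))(j))
```

In both cases `did` does nothing but copy its antecedent; the ambiguity is located entirely in the derivation of the source VP.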



[Fig. 13. Derivation of the strict and the sloppy reading of (12). Lexical
assignments: John, J : N; revised his paper, λx.R(P x) : (N\S)|N; and,
AND : (S\S)/S; Harry, H : N; did, λP.P : (N\S)|(N\S). In the strict derivation
the pronoun is linked to J : N, the source VP is R(P J) : N\S, and the result
is AND(R(P J)H)(R(P J)J) : S. In the sloppy derivation the pronoun is linked to
a hypothesis x : N that is discharged by \I, the source VP is
λx.R(P x)x : N\S, and the result is AND(R(P H)H)(R(P J)J) : S.]



Next we would like to draw attention to a kind of ambiguity that arises from
the interplay of quantification and VPE. Consider the following example.

(13) a. John met everybody before Bill did

     b. John met everybody before Bill met everybody

     c. John met everybody before Bill met him

As Ivan Sag observed in [21], constructions like (13a) are ambiguous between
a reading synonymous to (13b) and one synonymous to (13c). Under the present
approach, reading (13b) arises if the quantifier is scoped before ellipsis
resolution takes place. If scoping is postponed until after ellipsis
resolution, the antecedent of the ellipsis still contains a hypothetical N, and
accordingly the quantifier binds two occurrences of the corresponding variable.
Fig. 14 gives the derivations for the source VPs of the two readings.



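[Fig. 14. Source VPs in (13a). Lexical assignments: met, MEET : (N\S)/N;
everybody, EVERY : q(N, S, S). In the first derivation the quantifier is scoped
inside the VP, so the source VP is λx.EVERY(λy.MEET y x) : N\S. In the second
derivation the object position is still occupied by a hypothetical N when the
VP serves as antecedent, so the source VP is MEET x : N\S.]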



Again this phenomenon is not restricted to quantification. Whenever the
derivation of the source VP involves hypothetical reasoning and it is possible
to discharge the hypothesis after ellipsis resolution, multiple binding should
be possible. This is in fact the case. For wh-movement, this was also observed
in [21].

(14) a. (the man) that Mary met before Bill did

     b. How many miles are you prepared to walk if the people want you to

The preferred reading of (14a) is the man that Mary met before Bill met him.
Example (14b) is similar. Additionally it illustrates that this mechanism is
not restricted to hypotheses of type N, and that constructions with "binding
into ellipsis" need not correspond to a parallel construction without ellipsis
and with a pronoun.

Last but not least, we encounter the same pattern in connection with focus.
This was first noted in [12], where the following example is given:

(15) I only went to TANGLEWOOD because you did

The sentence has a reading saying: The only place x such that I went to x
because you went to x is Tanglewood.

Let me close this section with an example from [2] that demonstrates that the
ambiguity of bound versus coreferential interpretation of pronouns on the one
hand and the strict/sloppy ambiguity on the other hand are independent
phenomena:

(16) a. Every student revised his paper before the teacher did

     b. Every student_i revised his_j paper before the teacher_k revised his_j paper

     c. Every student_i revised his_i paper before the teacher_j revised his_j paper

     d. Every student_i revised his_i paper before the teacher_j revised his_i paper

Sentence (16a) has three readings (paraphrased in (16b-d)). Next to the
unproblematic cases where the pronoun is either free and strict (b) or bound
and sloppy (c), there is an interpretation where the pronoun is bound but
nevertheless strict (d). Gawron and Peters therefore assume a three-way
ambiguity of pronoun uses: referential as in (b), role-linking as in (c), and
co-parametric as in (d) (cf. [2]).

In the present system, all three readings fall out immediately, even though
the pronoun is unambiguous. If the pronoun is free, the derivation is analogous
to Fig. 7. Readings (16c, d) are derived by first plugging a hypothetical N
into the matrix subject position, giving the ellipsis a sloppy or strict
construal (as in Fig. 13), and then applying qE and thus replacing the
hypothetical N by the quantifier.



5 Comparison to Jacobson’s System 

Despite the overall similarity between Jacobson's and the present treatment of
anaphora, there are two important differences. As can be seen from rule |R,
according to our proposal a pronoun is licensed primarily by a preceding
antecedent. This antecedent may be hypothetical. In this case, the pronoun may
be linked to an argument place of a super-ordinate functor. This option is
employed in the derivation of bound and sloppy readings. Jacobson takes this
method of binding to be basic. Due to the lack of unrestricted associativity in
CCG, the restriction to super-ordinate functors is non-trivial here. This
aspect of her system leads to two shortcomings that can be avoided in the type
logical setting.¹

First, not all bound pronouns can be treated in this fashion. In (17) (from
[2]), there is no constituent that contains the pronoun and takes its
antecedent as an argument (even under the flexible notion of constituency that
CCG adopts).

(17) The soldiers turned some citizens in [each state]_i over to its_i governor

In the type logical formulation presented here, the discontinuity of the main
verb may require an involved treatment, but since the antecedent precedes the
pronoun, there is no problem with anaphora resolution.

As for the second point, reconsider the strict reading of (12). The only way to
give the pronoun a bound reading under the combinatory approach is to bind it
to the subject argument place of the verb revised. But this means that the
property to revise John's paper doesn't occur as the meaning of any
constituent. We only get the meaning to revise one's paper as the meaning of
the source VP. So if we assume an identity-of-meaning approach to ellipsis, we
only get the sloppy reading here if the pronoun is construed as bound. The best
we can do to derive the strict reading is to consider the pronoun as free and
accidentally co-referring with John. But as (16) demonstrates, strict readings
of bound pronouns are possible. The combinatory treatment of anaphora cannot
handle constructions like this one.

¹ The same objections apply to the proposals in [5] and [22] as well.

6 Cross-Sentential Anaphora 

To extend the type logical approach to grammar to the discourse level, we have
to introduce a new type, call it D, for discourses. Besides, type assignment
has to guarantee at least that every sentence is a discourse, and that
appending a sentence to a discourse yields a discourse again. So the following
two sequents should be theorems of L|:

(18) a. S ⇒ D

     b. D, S ⇒ D

Clearly these cannot be derivable sequents if S and D are atomic. So we have to
replace them by suitable complex types. Two options suggest themselves: both
S and D should be identified either with I\I or I/I for some type I.² They both
have a dynamic flavor: I\I is akin to the view of Discourse Representation
Theory, File Change Semantics or Update Semantics (cf. [4, 11, 24]), where a
sentence defines a function from information states to information states. I/I
resembles DMG (Dynamic Montague Grammar, cf. [3]), where a sentence meaning is
a function from possible continuations to "static" sentence meanings. I'll
adopt the latter, DMG-style option here. The semantic type corresponding to I
is t (or (s,t) if we incorporate intensionality). A sentence like John walks
will receive the DMG-style meaning λp(WALK J ∧ p). Now consider a sample
discourse like

(19) John walked. He talked

If S is uniformly replaced by I/I in the type assignments, the relevant sequent
becomes

(20) N, N\(I/I), N|N, N\(I/I) ⇒ I/I

Following the DMG philosophy again, we assume that verbs like walk and talk
denote dynamic properties, i.e., λxp.(({WALK/TALK} x) ∧ p). Fig. 15 shows that
the sequent is valid and that the derived meaning is

(21) λp.((WALK J) ∧ (TALK J) ∧ p)

The static truth conditions are obtained by applying this to the tautology.

² To be precise, S should be identified with □↓(I/I) or □↓(I\I) for some domain
modality in the sense of [17] to take the special status of sentences into
account. I'll ignore this issue here.



[Fig. 15. John walked. He talked. Lexical assignments: John, J : N; walked,
λxp.WALK x ∧ p : N\(I/I); He, λx.x : N|N; talked, λxp.TALK x ∧ p : N\(I/I).
The pronoun is linked by |E to J : N. With a hypothesis p : I, the second
sentence yields TALK J ∧ p : I, the first sentence applied to this yields
WALK J ∧ TALK J ∧ p : I, and /I gives λp.WALK J ∧ TALK J ∧ p : I/I.]




This treatment can easily be extended to indefinites. Let us assume that an
indefinite like someone has category q(N, I, I) and meaning λP.∃xPx.

(22) Someone walked. He talked.

As Fig. 16 shows, a discourse like (22) receives type I/I and the meaning
λp.∃x((WALK x) ∧ (TALK x) ∧ p). The mechanism of cross-sentential/dynamic
binding is thus essentially the same as for sentence internal binding. Other
quantifiers like every man can be prevented from taking discourse scope by the
same multimodal mechanisms that block them from outscoping "and" in coordinate
structures (see for instance [18]).
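The DMG-style meanings used in this section can be evaluated directly in a small model. The sketch below is an illustration under stated assumptions: the domain and the extensions of walk and talk are invented for the example, propositions are modelled as Python booleans, and `concat` is a hypothetical name for the composition step that feeds the second sentence to the first as its continuation. Applying a discourse meaning to `True` plays the role of applying it to the tautology.

```python
# A toy model; the domain and extensions are assumptions for illustration.
DOMAIN = {"j", "m"}
WALKERS = {"j", "m"}
TALKERS = {"j"}


def walk(x):
    # Dynamic property: takes an individual, then a continuation p.
    return lambda p: (x in WALKERS) and p


def talk(x):
    return lambda p: (x in TALKERS) and p


def he(x):
    # Pronoun of category N|N: identity on individuals.
    return x


def concat(s1, s2):
    """Combine two meanings of type I/I: s2's result is s1's continuation."""
    return lambda p: s1(s2(p))


# (19) John walked. He talked, with the pronoun linked to John.
john_discourse = concat(walk("j"), talk(he("j")))

# The same discourse about m, who walks but does not talk in this model.
m_discourse = concat(walk("m"), talk(he("m")))


def someone(P):
    # Indefinite of category q(N, I, I): existential over the domain.
    return any(P(x) for x in DOMAIN)


# (22) Someone walked. He talked, with the pronoun linked to the hypothesis
# that stands in for the indefinite, which then takes discourse scope.
def someone_discourse(p):
    return someone(lambda y: walk(y)(talk(he(y))(p)))


print(john_discourse(True))     # prints: True
print(m_discourse(True))        # prints: False
print(someone_discourse(True))  # prints: True
```

The indefinite binds the pronoun across the sentence boundary only because its hypothesis is discharged after both sentences have been combined, which mirrors the order of rule applications described for Fig. 16.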

7 Incremental Interpretation 

[Fig. 16. Someone walked. He talked. Lexical assignments: someone,
λP.∃xPx : q(N, I, I); walked, λxp.WALK x ∧ p : N\(I/I); He, λx.x : N|N; talked,
λxp.TALK x ∧ p : N\(I/I). The pronoun is linked by |E to the hypothesis y : N
that stands in for the indefinite. With a hypothesis p : I, the two sentences
yield WALK y ∧ TALK y ∧ p : I; qE gives ∃x(WALK x ∧ TALK x ∧ p) : I, and /I
gives λp.∃x(WALK x ∧ TALK x ∧ p) : I/I.]

This treatment of cross-sentential anaphora is compatible with the intuitive
requirement that discourse interpretation works incrementally. Note that in all
compositional theories of dynamic semantics, the meaning of a sentence includes
information about how many old discourse markers are picked up, and how many
novel discourse markers are introduced. Under the type logical perspective,
this information has to be encoded in the type of a sentence. So we admit a
limited polymorphism for the category of sentences. To take two simple
examples, A man walks will have (among others) the type I/(I|N) since it
licenses a subsequent pronoun. He walks will have type (I/I)|N, indicating that
it contains an old discourse referent that needs an antecedent. Generally, we
assume that a sentence containing n (locally unbound) pronouns and introducing
m discourse entities will have category (I/(I(|N)^m))(|N)^n. Semantically, such
a sentence will denote a function from n individuals and an m-place relation to
a proposition. Sentence concatenation corresponds to a family of generalized
versions of function composition, where each argument place of the second
sentence may either be filled by one of the discourse markers introduced by the
first sentence, or projected to the discourse as a whole. [23] briefly
considers a proposal similar to this, but rejects it due to the apparent
proliferation of sentence combining operations. This is no obstacle here
though, since all these operations are derivable in L|.



8 Future Work 

Two directions for further research suggest themselves. In its present shape,
the system cannot cope with cataphora. This phenomenon is very limited and
many cases should arguably be treated as accidental coreference instead of as a
grammatical dependency (cf. [25] for an insightful discussion), but there are
undeniable cases of grammatically determined backward binding.

Furthermore, it is as yet unclear how other types of ellipsis should be
incorporated. VP ellipsis is simple insofar as it leaves a proform, the
stranded auxiliary. To that extent, ellipsis resolution is rooted in the
lexicon here. This is not the case with gapping, stripping etc. So apparently
an adequate extension of L| has to include devices to create anaphoric types in
syntax. In other words, L| is still too weak for a treatment of ellipsis in
general. Any strengthening of the logic is at risk of losing the finite reading
property though.

9 Conclusion 

This paper proposed the logic L| as a type logical reconstruction of Pauline 
Jacobson’s treatment of anaphora in CG. It was shown that L| is weak enough to 
have the finite reading property, but strong enough to handle the multiplication 
of resources that we find in anaphoric dependencies. Paired with Moortgat’s type 
logical approach to quantification, we are able to cope with a substantial amount 
of phenomena concerning pronominal anaphora, VP ellipsis and quantification. 
Finally it was sketched how cross-sentential anaphora can be handled under this
approach.

Abstract. This paper describes a system of categorial inference based
on insights from Lexicalized Tree Adjoining Grammar (LTAG). LTAG is 
a tree-rewriting system and therefore deals with structural, not string, 
adjacency. When looked at from the logic perspective, the nodes of the 
trees become types as in a categorial grammar, with corresponding de- 
ductive connections between parent and daughter nodes. The resulting 
system is based on a hybrid logic, with one logic for building Partial 
Proof Trees, and the other for composing the partial proofs. We reex- 
amine the use of structural modalities in categorial grammar from this 
perspective, concluding that the use of structural modalities can be con-
siderably simplified, or even eliminated in some cases. The generative
power of the hybrid logic system is beyond context-free, as we demon- 
strate with a derivation of the cross-serial dependencies in Dutch. The 
system also inherits polynomial parsing from LTAG. 



1 Introduction 

The purpose of this paper is to describe a perspective on categorial inference 
based on insights from Lexicalized Tree Adjoining Grammar (LTAG). LTAG 
is a tree-rewriting system and therefore deals with structural, and not string, 
adjacency. Being lexicalized, a structure is associated with each lexical item. 
When looked at from the logic perspective, the nodes of the trees become types 
as in a categorial grammar, with corresponding deductive connections between 
parent and daughter nodes. 

The resulting categorial system has some advantages as compared with type- 
logical systems based on the Lambek calculus. First, it inherits polynomial pars- 
ing from LTAG (while complexity of parsing in Lambek grammar is not known 
yet). Second, resource management is localized to the domains of the elemen-
tary objects of the grammar. This allows a simplification of the use of structural
modalities because their use is localized to these domains. Third, key insights
from LTAG are incorporated into a system of categorial inference — namely, the 
extended domain of locality and consequent factoring of recursion from the do- 
main of dependencies. This means that in some cases the use of a structural 
modality can be eliminated, while in other cases, their role is translated into 
constraints on tree composition (stretching). 



Fig. 1. Sample Partial Proof Tree Derivation



2 Partial Proof Trees 

When viewed from the context of categorial grammar, LTAG can be seen as a 
system of partial proof trees (PPTs) (see 0 for details). The key idea is that 
instead of associating a type with each lexical item, we associate one or more 
partial proof trees, and each tree is obtained by unfolding the arguments of the 
type. 

The basic PPTs then serve as the building blocks of the grammar, and com-
plex proof trees are obtained by ‘combining’ these PPTs by two operations:
substitution and stretching, illustrated in Figure 1. Substitution, shown by the
dashed lines, makes use of the terminal node of a tree, such as substituting 
the PPT for Bob into an np node in the tree for likes, as shown. The second 
operation, stretching, provides access to an internal node in a tree, illustrated 
by the solid lines in the figure. Here the PPT for passionately is linked to the 
“stretching” of the internal np\s node in the likes tree. 
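
The two operations can be pictured with a small tree manipulation sketch. It is only an illustration under assumptions that are not part of the paper: the `Node` class, the function names, and the use of Hazel as the object of likes (a name taken from the figure) are hypothetical, and trace assumptions are not modelled. Substitution fills an open terminal node; stretching pulls an internal node apart and places an auxiliary tree between the two halves.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    label: str                        # a type such as "np" or r"np\s"
    children: List["Node"] = field(default_factory=list)
    word: Optional[str] = None        # lexical anchor, if this leaf carries a word
    open_slot: bool = False           # True for an unfilled terminal node


def substitute(tree: Node, label: str, arg: Node) -> bool:
    """Fill the leftmost open slot labelled `label` below `tree` with `arg`."""
    for i, child in enumerate(tree.children):
        if child.open_slot and child.label == label:
            tree.children[i] = arg
            return True
        if substitute(child, label, arg):
            return True
    return False


def stretch(tree: Node, label: str, aux: Node) -> bool:
    """Stretch the leftmost internal node labelled `label` below `tree`.

    `aux` takes the place of that node, and the original subtree is plugged
    into the open slot of `aux` that carries the same label.
    """
    for i, child in enumerate(tree.children):
        if child.label == label and child.children:
            if aux.label != label or not substitute(aux, label, child):
                raise ValueError("auxiliary tree does not fit this node")
            tree.children[i] = aux
            return True
        if stretch(child, label, aux):
            return True
    return False


def tree_yield(tree: Node) -> List[str]:
    """Read the words off the leaves, left to right."""
    if tree.word is not None:
        return [tree.word]
    return [w for child in tree.children for w in tree_yield(child)]


# PPT for "likes": both np arguments are unfolded.
likes = Node("s", [
    Node("np", open_slot=True),
    Node(r"np\s", [
        Node(r"(np\s)/np", word="likes"),
        Node("np", open_slot=True),
    ]),
])

# PPT for "passionately": unfolded only to the np\s level.
passionately = Node(r"np\s", [
    Node(r"np\s", open_slot=True),
    Node(r"(np\s)\(np\s)", word="passionately"),
])

substitute(likes, "np", Node("np", word="Bob"))     # subject
substitute(likes, "np", Node("np", word="Hazel"))   # object (assumed name)
stretch(likes, r"np\s", passionately)
sentence = " ".join(tree_yield(likes))
print(sentence)
```

Stretching the root itself is not covered by this sketch, since it only looks at nodes below `tree`.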

As the figure illustrates, dependencies are represented in the elementary trees 
by the unfolding process. For example, while the NPs for likes are both clearly 
semantic arguments and so are unfolded, only the verb phrase, but not the 
noun phrase, is an argument of the adverb passionately, and so the tree unfolds 
only to the np\s level. Since we are using the Lambek calculus, function appli-
cation and conditionalization both play an important role in the construction
of the PPTs. However, crucially, trace assumptions, once introduced, must be
discharged within the same PPT in which they originate. Since, as will be dis- 
cussed, such PPTs are of a small bounded size, this allows for a localization of 
resource management with corresponding computational advantages. The same 
advantages hold for the use of structural modalities, such as modally-licensed 
Permutation, which can be explored only within the scope of an elementary 
tree and therefore have a restricted capacity to influence the whole structure of 
generated strings. 

The operations of substitution and stretching do not affect this localization. 
The important point is that constraints that license the basic PPTs are distinct 
from these operations that combine the PPTs. 

3 Hybrid Logic 

A logical modelling of the PPTs system makes use of a hybrid logic; i.e., two 
kinds of logic are involved. This is a consequence of LTAG being a tree-rewriting 
system. We distinguish the logic of constructing basic trees and combining trees. 
Construction of basic trees is guided by the logic of a CG, while both operations 
of combining trees (substitution and stretching) are encoded by a single rule: Cut. 
The logic of constructing the basic trees is based on the usual understanding of 
structure-sensitive consequence relations between formulas as types. The logic 
of combining trees, however, defines how some set of proofs can be transformed 
into another proof. Therefore the consequence relation of the logic of combining 
trees is defined on proofs. Before giving details of the system, we first discuss an 
illustrative example¹ and then give a scheme of a general definition.

3.1 An Example 

(1) ...who Bob meets today 

Consider the example (1) of a relative clause with an adverb, which is a
classic example of the use of a Permutation modality in categorial grammar 0 
We will show that hybridization of the inference allows us to avoid the use of 
the structural modality. 

The PPT system derivation is shown in Figure 2. This derivation takes ad-
vantage of the possibilities for conditionalization discussed above. As can be
seen, the tree for meets assumes an NP assumption which is locally discharged.
The sequents in (2) model the derivation of the first part of the meets tree. To
avoid unnecessary detail, we are not discussing the latter part of the tree with 
the use of who. 

(2) a. meets : (np\s)/np, np => np\s
    b. Bob : np, np\s => s

¹ With some abuse of notation, and ignoring substitution.

² The need for a permutation modality is discussed in more detail in Section 4.

Fig. 2. Sample Partial Proof Tree Derivation with Conditionalization

    c. Bob meets np => s
    d. Bob meets => s/np
    e. who Bob meets => n\n
    f. n n\n => n

(3) a. np\s today => np\s

(4) a. meets np today => np\s
    b. np np\s => s
    c. Bob meets np today => s
    d. Bob meets today => s/np
    e. who Bob meets today => n\n
    f. n n\n => n



The sequent in (3) models the derivation for the today tree. It is important
that every step of the derivation in (2) is presented, to make internal nodes
available for stretching.

The sequents in (4) illustrate how the second logic, a result of the hybridiza-
tion, is used to combine the sequents in (2) and (3). The first step is a cut
application with premises (2a) and (3a), resulting in (4a). The second step re-
places meets np with meets np today everywhere in the derivation of (2), resulting
in the remaining sequents of (4).

Note that, crucially, the structure of (2) is not disturbed by this replacement.
Therefore, while (4c) has the appearance of a violation of the Lambek calculus,
since the hypothetical np is no longer on the right periphery of the expression in 
the presence of the adverb, this step is justified. The relations between the types 
in an elementary tree are fixed by the creation of the tree. Since the stretching 
process maintains the relations between the types, the non-peripheral extraction 
is legitimate due to the application of the second logic, which does not disturb 
the type relations in the elementary trees.

3.2 The Hybrid Logic

Figure 3 illustrates the logic used for constructing the basic PPTs. As men-
tioned, it is based on the usual understanding of structure-sensitive consequence
relations between formulas as types. 

Functional application

    A, A\B => B          B/A, A => B

Conditionalization

    A, X => B            X, A => B
    ---------            ---------
    X => A\B             X => B/A

Cut

    X => A    Y[A] => C
    -------------------
         Y[X] => C

Fig. 3. Logic of Constructing the Basic PPTs
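
As a rough illustration of how Cut and Conditionalization act on the sequents of example (2), the following sketch treats a sequent as a pair of an antecedent tuple and a succedent type. The representation and the function names are assumptions made for this illustration only; like the sequents in (2), it mixes words and types in the antecedent.

```python
from typing import Tuple

Sequent = Tuple[Tuple[str, ...], str]    # (antecedent items, succedent type)


def paren(t: str) -> str:
    """Bracket a complex type so that it can be embedded in a larger one."""
    return f"({t})" if "/" in t or "\\" in t else t


def cut(left: Sequent, right: Sequent) -> Sequent:
    """From X => A and Y[A] => C infer Y[X] => C (leftmost occurrence of A)."""
    x, a = left
    y, c = right
    i = y.index(a)                       # raises ValueError if A is not in Y
    return (y[:i] + x + y[i + 1:], c)


def conditionalize_right(seq: Sequent) -> Sequent:
    """From X, A => B infer X => B/A, withdrawing the rightmost item A."""
    antecedent, b = seq
    *x, a = antecedent
    return (tuple(x), f"{paren(b)}/{paren(a)}")


s2a = (("meets", "np"), r"np\s")         # meets np => np\s
s2b = (("Bob", r"np\s"), "s")            # Bob np\s => s
s2c = cut(s2a, s2b)                      # Bob meets np => s
s2d = conditionalize_right(s2c)          # Bob meets => s/np
print(s2c, s2d)
```

The sketch covers only the logic for constructing basic PPTs; the replacement step that combines PPTs by stretching is a separate operation and is not modelled here.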



The logic for combining PPTs via stretching is as follows: Let Δ1 be a proof
containing (*) X => A and Δ2 be a proof containing (**) Y A Z => A as the
last sequent. Then a new proof Δ3 contains all sequents preceding (*) and (**)
in Δ1 and Δ2 respectively, a new sequent Y X Z => A, and all sequents of Δ1
provided that X is replaced by Y X Z, with no change to the structure.

This is illustrated in Figures 4 and 5 for the example discussed in this section.
In Δ1 in Figure 4, X is meets : (np\s)/np np and A is np\s (and so X => A
is sequent (2a)). In Δ2 in Figure 4, Y is empty, A is np\s, and Z is today :
(np\s)\(np\s) (and so Y A Z => A is sequent (3a)). Therefore, Δ3 in Figure 5 contains
the new sequent Y X Z => A, or meets : (np\s)/np np today : (np\s)\(np\s) =>
np\s, and all sequents of Δ1 with meets : (np\s)/np np replaced by meets :
(np\s)/np np today : (np\s)\(np\s), with no change in structure to Δ1. This is
the set of sequents in (2), with meets np replaced by meets np today.

There is some relation between this system and that of proposals such as those
of 0 and 0. The main difference is hybridization and the resulting stronger 
expressivity. Moreover, our system makes essential use of conditionalization. 

4 Structural Control and Partial Proof Trees 

It is well known that the usage of unary modalities (◇, □↓) is an effective way
to provide structural control in categorial inference. Categorial systems with
structural modalities (see for details) can not only incorporate limited
relaxation of the rigid structure to provide more generative capacity, but also
impose additional constraints to block undesired derivations. In other words,
different unary modalities can license structural relaxation and enforce a stronger
structural discrimination.

Fig. 4. Logic of Combining the PPTs

Fig. 5. Logic of Combining the PPTs - Result

These modalities play a somewhat different role when used in the context of 
this hybrid logic, as we now discuss. 



4.1 Modality Eliminated 

In a multi-modal system, the permutation modality can be used to allow the
derivation of an object relative clause with an adverb, as in ... who Bob meets
today. This is because the same type assignments that allow the derivation of a
simple object relative clause (5) will not allow the derivation with an adverb,
as in (6).

To overcome this difficulty, the type assignment of who is modified, as in (7),
which together with the use of the permutation modality allows the np argument
to move to the needed location.

(5) who Bob meets

    ⊢ r/(s/np) np (np\s)/np => r

(6) who Bob meets today

    ⊬ r/(s/np) np (np\s)/np s\s => r

(7) who Bob meets today

    ⊢ r/(s/np♯) np (np\s)/np s\s => r

    ⊢ np (np\s)/np s\s => s/np♯

As we have already seen in the example in Figure 2, in the PPT approach the
modality is not needed at all. This is because today simply “stretches” into the
PPT for ... who Bob meets. Thus, non-peripheral extraction, which otherwise
requires the permutation modality, is handled inherently by the use of two logics.

In Section 5 we discuss a more extensive case of the elimination of the need
for modalities.



4.2 Modality Required, but Localized 

However, in some cases we still retain the need for a modality but its use can 
be localized with desirable formal consequences. For example, topicalization is 
handled using the permutation modality. However, exactly because its use is 
localized, since every assumption must be discharged within the same elementary 
tree, it does not cause a collapse of the system. This use of local permutation is 
illustrated in Figure 6 for the sentence Apples John likes.

Unrestricted flexibility of permutation modality may lead to overgeneration. 
Since permutation is localized in this system, the problem does not arise. This 
localization is not a stipulation imposed on the system. That is, it is not the case 
that the logic needs to refer to where the parts of proofs come from. Since we
use different logics, one for constructing the basic PPTs and one for combining
them, the localization of the first logic to these building blocks (that is, the basic
PPTs) is inherent in the system, and is not an arbitrary restriction.

Fig. 6. Localized Permutation

Fig. 7. Long-Distance Permutation



Furthermore, no complications are raised by the case of long distance topi-
calization, as in Apples Bill thinks John likes, as illustrated in Figure 7. There
is no change in the likes tree except that the s node is stretched. The localized 
use of permutation in the likes PPT is completely unaffected by this. The use of 
stretching allows Bill thinks to be inserted into the topicalized PPT for Apples 
John likes, resulting in Apples Bill thinks John likes.

4.3 Modality as Constraint on Stretching

In multi-modal categorial grammars, constraining modalities are needed to pre-
vent various kinds of overgeneration. Their role can be illustrated by the repre-
sentation of the relative clause with coordination in (8a), with the type assign-
ments in (8b) and the validity of the latter proven by (8c).

(8) a. that John wrote and Bob read

    b. ⊢ r/(s/np) (np (np\s)/np (X\□↓X)/X np (np\s)/np)^◇ => r
    c. ⊢ np (np\s)/np (X\□↓X)/X np (np\s)/np => □↓(s/np)
       (with X = s/np)

(9) *(the book) that John wrote Moby Dick and Bob read 

The modal decorations on the types in (8b) are used to prevent the over-
generation of an island violation such as (9). With no such modality constraint,
(9) could be derived with X instantiated to s, since the np assumption could be
used to derive an s for Bob read np, which would coordinate with John wrote
Moby Dick. However, with the modal decoration used for and, and the closing off
of the coordinate structure with the dual structural modality (·)^◇, the undesired
derivation fails because the hypothetical np finds itself in the scope of the modal
operator.

While this is certainly a valid approach, the PPT system’s hybrid logic allows 
for a somewhat different perspective on this problem. First, structural bracketing
such as (·)^◇ can be understood as a command to make a constituent. With
PPTs, the constituents do not need to be stipulated — they arise naturally from 
the unfolding of the trees. Second, the effect of the modality is handled by 
a restriction on the schema for coordination, where we feel it most naturally 
belongs. The crucial difference between the two approaches is that the PPTs 
system does not rely on modal type decorations, which can get successively 
more complex as the system becomes more sophisticated, although we do not 
discuss that in full detail here. 

To illustrate, Figure 8(A) shows a partial proof tree for an object relative
clause. It is used together with a coordination partial proof tree in Figure 8(B) to
derive (8a). The tree in Figure 8(B) is inserted into that in Figure 8(A) by the use
of stretching at the s/np node of the latter, deriving (8a).

The violation (9) is ruled out for the simplest of reasons. This sentence
would require and Bob read to adjoin into a tree for that John wrote Moby Dick. 
The latter tree is simply not a well-formed object relative clause tree since there 
is no gap. 

(10) * ...that John wrote and Bob read Moby Dick 

A more interesting case is that of the corresponding violation (10), with
the potential derivation shown in Figure 9. Since that John wrote is clearly an
acceptable relative clause tree, what needs to be prevented is a tree for and Bob
read Moby Dick stretching in at the s node, therefore coordinating at the s node.
However, this is invalid because the s node for the that John wrote tree has an
undischarged assumption while the s node for the and Bob read Moby Dick tree does
not.

Fig. 8. Object Relative Clause (A) and Coordination (B) Trees

Fig. 9. A Disallowed Derivation

When a tree is inserted for stretching in the case of coordination, not only
must the labels match for the stretching, but the labels must also coincide on any
undischarged assumptions. So in the case of (10), the s node in the tree for that
John wrote is more complex than just s, and also contains a list of undischarged 
assumptions, in this case the object np. In the tree for and Bob read Moby Dick, 
the s node has an empty list of undischarged assumptions. Since the two s nodes 
are not formally the same, the undesired coordination cannot take place.

More generally, while other approaches need to stipulate structural modalities
and bracketing to capture the constituent structure needed for coordination, 
such structure is available for free in our approach. This is due precisely to the 
characterization of the PPTs as the unfolding of a lexical item. A good example 
of the advantages of this approach is given by the string I think that Harry 
and Mary will lend you the money, which is problematic for categorial grammar 
since both I think that Harry and Mary can have type s/(np\s), and therefore
coordinate with an invalid reading. However, since for us coordination is based on 
more than just a type, but rather on the structural equivalence of the conjuncts, 
I think that Harry and Mary can easily be distinguished as unfit for coordination 
due to their different proof trees. 
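
The matching condition on coordination described above can be sketched as a comparison of node labels that carry their undischarged assumptions. This is a minimal illustration only: the `NodeLabel` class and `can_coordinate` are hypothetical names, and comparing assumption lists by simple equality is an assumption of the sketch, not a definition from the paper.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class NodeLabel:
    category: str                          # the type at the node, e.g. "s"
    undischarged: Tuple[str, ...] = ()     # assumptions still open at this node


def can_coordinate(site: NodeLabel, conjunct_root: NodeLabel) -> bool:
    """A conjunct may stretch in only if the types agree and the lists of
    undischarged assumptions coincide."""
    return (site.category == conjunct_root.category
            and site.undischarged == conjunct_root.undischarged)


# (8a): coordination at the s/np nodes, where the np has been discharged.
ok = can_coordinate(NodeLabel("s/np"), NodeLabel("s/np"))

# (10): the s node of "that John wrote" still has an open np assumption,
# while the s node of "and Bob read Moby Dick" has none.
blocked = can_coordinate(NodeLabel("s", ("np",)), NodeLabel("s"))
print(ok, blocked)
```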

5 Cross Serial Dependencies in Dutch

We now address the classical problem of generating cross-serial dependencies in
Dutch, as in sentence (11), again using the hybrid logic.³

(11) dat  Jan Piet Marie zag laten zwemmen
     that Jan Piet Marie saw make  swim
          N1  N2   N3    V1  V2    V3
     ‘that Jan saw Piet make Marie swim’

Figure 10 shows the PPTs used for each clause in the Dutch example. We
discuss below a logical view of these trees and how they are put together to
derive the cross-serial dependencies. The crucial point to note about these trees
is that a hypothetical assumption is used for the verb, which is then discharged
later in the tree. This is accomplished by letting the anchor of the tree be a
type-raised noun phrase, rather than the verb, and so the verb itself substitutes
in.⁴ The main point to note for our approach is that we must allow the verbs to
substitute without having their arguments unfolded. For example, if laten had
its np and s arguments unfolded as usual, it would form a PPT and not be
able to substitute into the [s\np\s] argument in PPT (B) in Figure 10. Thus we
conclude that while verbs may unfold, they are not obligated to.⁵

Logic models for these trees are presented in (12), (13), (14), which corre-
spond to the trees (A), (B), (C) in Figure 10. For clarity of presentation we ignore

³ This analysis is based on the TAG analysis given in 0.

⁴ This accomplishes the same result as verb-raising in the TAG analysis of P), creating
the internal S nodes in the trees as a locus for the required stretching. An
alternative approach is to allow an empty verb to head the tree, which takes the real 
verb as an argument. There are some technical problems with this option, which we 
cannot discuss here. Note also that the simulation of verb raising in the tree for Jan 
zag is not necessary to derive the cross-serial dependencies, although we have kept 
it here. 

⁵ This suggestion had already been made previously in [2]. An alternative is to allow
the verb and the NP argument to be “co-anchors” of the tree. We leave this option
aside for now.

Fig. 10. Trees for Dutch cross-serial dependencies




substitution and use the following abbreviation: a stands for s\np\s. Also, since
the type of v1 (zag) and v2 (laten) is s\np\s, the following sequents are valid:
s/a v1 => s and s/a v2 => s. The sets of sequents (12), (13), (14) correspond in
a simple way to the PPTs (A), (B), (C).

(12) tree for Jan

     a. Jan s => s/a
     b. s/a a => s
     c. Jan s a => s (cut of a,b)
     d. Jan s => s/a (conditionalization)
     e. s/a v1 => s
     f. Jan s v1 => s (cut of d,e)

(13) tree for Piet

     a. Piet s => s/a
     b. s/a a => s
     c. Piet s a => s (cut of a,b)
     d. Piet s => s/a (conditionalization)
     e. s/a v2 => s
     f. Piet s v2 => s (cut of d,e)

(14) tree for Marie

     a. Marie np\s => s
     b. Marie => s/(np\s) (conditionalization)
     c. s/(np\s) v3 => s
     d. Marie v3 => s (cut of b,c)





Fig. 12. The result of stretching (A) into (B)



Before detailing the use of the second aspect of the hybrid logic (the use
of cut to compose the PPTs together), we illustrate using the figures of the
trees how sentence (11) is derived using the PPTs in Figure 10. The basic idea
is shown in Figure 11. The internal S nodes of the (B) and (C) (v2 and v3)
PPTs are stretched and the (A) and (B) (v1 and v2) PPTs are, respectively,
inserted. Figure 11 shows the two stretchings (of (A) into (B), and of (B) into
(C)) occurring simultaneously. Technically, however, they are two independent
compositions and must occur in sequence, although it does not matter which
order they occur in. Here we use the order of (A) stretching into (B), and (B)
stretching into (C).

Fig. 13. The result of stretching Figure 12 into (C)

Figure 12 shows the result of (A) stretching into (B). The yield of the result is
Jan Piet zag laten (N1 N2 V1 V2). Note that we have renumbered the assumption
that is assumed and discharged in (A) from 1 to 2 in Figure 12. This is only for
expository purposes, to avoid confusion with the assumption discharged in the
tree for Piet laten, which remains as 1 in the figure. The numbers themselves
of course have no relevance whatsoever; the important point is that there are
no assumptions or withdrawals taking place as a result of the stretching. The
dependencies that are present in (A) and (B) remain in Figure 12, with the
dependency between the assumed and discharged [s\np\s] in the Piet laten PPT
getting “stretched apart” by the insertion of the Jan zag tree. Crucially, the yield
of the tree in Figure 12 is the desired crossed dependencies for the two clauses.

The final step is the stretching of Figure 12 into (C) in Figure 10. The
result is shown in Figure 13. Once again the hypotheticals have been renumbered,
with no significance other than appearance. The yield of the tree is sentence (11).

We now describe the precise logical steps involved in the derivation, by ap-
plying our technique of combining PPTs. First, the cut rule is applied to (12a)
and (13b), which results in Jan s/a a => s/a. Since, according to our strategy,
we replace s by s/a a in (12), the last step of the tree, corresponding to (12f),
is Jan s/a a v1 => s. This sequent replaces (13b), which means that in (13) the
context s/a a is replaced by the new context Jan s/a a v1, with no structural
change. The result is shown in (15).



(15) a. Piet s => s/a
     b. Jan s/a a v1 => s
     c. Jan Piet s a v1 => s (from cut of a,b)
     d. Jan Piet s v1 => s/a (conditionalization)
     e. s/a v2 => s
     f. Jan Piet s v1 v2 => s

Now we make a cut between (15) and (14). We show in (16) the resulting
PPT, again keeping the structure, in this case changing the context from
Marie np\s to Jan Piet Marie np\s v1 v2.

(16) a. Jan Piet Marie np\s v1 v2 => s
     b. Jan Piet Marie v1 v2 => s/(np\s) (conditionalization)
     c. s/(np\s) v3 => s
     d. Jan Piet Marie v1 v2 v3 => s (cut of b,c)

Unlike the multi-modal approach ( 0 , 0 ) to derive cross-serial dependencies 
in Dutch, the system of categorial inference described here implicitly handles 
permutation without introducing structural modalities, thanks to the properties 
of the hybrid logic. 

We conclude this section with a note on the generative power of this system. 
The cross-serial derivation shows that the PPTs system can derive languages 
that are strongly more powerful than context-free grammar. This construction 
can be easily extended to a strictly context-sensitive language, such as a^n b^n c^n.

6 Conclusion 

We have discussed here some key concerns of categorial grammar from the per- 
spective of a hybrid logic system motivated by the insights of Lexicalized Tree 
Adjoining Grammar. The hybridization of logical inference leads to a localization 
of resource management that simplifies the use of structural modalities to control 
such management. The resulting system can derive languages that are beyond 
context-free. In addition, desirable computational properties are inherited from 
LTAG, in particular polynomial parsing. 

We conclude by mentioning some issues for further work. First, we would 
like to explore the handling of right node raising in this framework. There are 
well-known differences between the leftward extraction discussed here and the 
rightward extraction of the RNR cases. While we cannot discuss it here, the 
prospects are promising because we can distinguish between the two cases, due 
to the use of proof trees rather than just types, and because our coordination 
schema depends on the basic partial proof trees associated with each conjunct. 
The leftward and rightward cases will have different proof trees and the different 
properties can therefore be modelled appropriately.

We also plan to refine the coordination schema to handle the coordination 
of simple NPs and generalized quantifiers, such as Bob and a boy walked, by 
incorporating deductively-connected types in the coordination schema.

Abstract. Dominance constraints for finite tree structures are widely
used in several areas of computational linguistics including syntax, 
semantics, and discourse. In this paper, we investigate algorithmic and 
complexity questions for dominance constraints and their first-order 
theory. The main result of this paper is that the satisfiability problem of 
dominance constraints is NP-complete. We present two NP algorithms 
for solving dominance constraints, which have been implemented in 
the concurrent constraint programming language Oz. Despite the 
intractability result, the more sophisticated of our algorithms performs 
well in an application to scope underspecification. We also show that 
the positive existential fragment of the first-order theory of dominance 
constraints is NP-complete and that the full first-order theory has 
non-elementary complexity. 

Keywords. Dominance constraints, complexity, computational 
linguistics, underspecification, constraint programming. 



1 Introduction 

Dominance constraints are a popular tool for describing trees throughout com-
putational linguistics. They make it possible to express both immediate dominance
(and labeling) relations and general (reflexive, transitive) dominance relations between
the nodes of a tree. In syntax, they provide for underspecified tree descriptions 
employed in deterministic parsing HH and to combine TAG with unification 
grammars |![|. In underspecification of the semantics of scope ambiguities, dom-
inance constraints are omnipresent. While they are somewhat implicit in earlier
approaches they are used explicitly in two recent formalisms |bl I dj . An 

application of dominance constraints in discourse semantics has recently been 
proposed in [HI, and they have been used to model information growth and par-
tiality [12].

M. Moortgat (Ed.): LACL’98, LNAI 2014, pp. 100- 11^ 2001.

Despite their popularity, there have been no results about the computational
complexity of solving these constraints, i.e. of finding a tree that satisfies all
given constraints. In this paper, we remedy this situation by proving that the
satisfiability problem of dominance constraints is NP-complete. This result holds 
for all logical fragments between the purely conjunctive fragment and the positive 
existential fragment. We present two algorithms for solving them; one involves 
a nondeterministic guessing step, which makes it convenient for a completeness 
proof, but not for implementation, whereas the other gives priority to determin- 
istic computation steps and enumerates cases only by need. Finally, we show that 
the first-order theory over dominance constraints with a signature of bounded 
arity is decidable and has non-elementary complexity. The decidability result is 
not new - e.g. US! sketches a proof for a different variant of dominance con- 
straints -, but we work out the details of a transparent proof by encoding into 
monadic second-order logic for the first time.

Related Work. In m it was shown how to solve formulae from the propositional 
language over (a different variant of) dominance constraints (over a different 
type of trees). There, tableau-style saturation rules for enumerating models are 
presented which are quite similar to the ones we use here. This solution procedure 
terminates, but there are no complexity results. Continuing this line of work, ^ 
present a set of first-order axioms over dominance constraints which capture
certain classes of trees. 

From an implementation perspective, dominance constraints were approached 
first in [3], which presents an implementation based on finite set constraints. A
more advanced version of the algorithm presented here and an implementation 
thereof are given j^. This implementation is also based on finite set constraint 
programming but improves that of [3].

Plan of the paper. In Section 2, we start out by defining the syntax and semantics
of dominance constraints. In Section 3, we present the solution algorithms for
dominance constraints and prove their soundness, completeness, and NP run-
times. The algorithms are first defined for the (purely conjunctive) language
of dominance constraints and extended to the other propositional connectives
later. In Section 4, we complement this result by proving NP-hardness of the
problem. In fact, we will not really provide the details of the proof, but give a
thorough explanation of the proof idea. In Section 5, we turn to the decidability
and complexity of the first-order theory over dominance constraints. Section 6
summarizes and concludes the paper. Some of the proofs are only sketched; for
more details, we refer the reader to |B|. 

2 Syntax and Semantics of Dominance Constraints 

In this section, we define the syntax and semantics of dominance constraints. 
To this end, we first introduce the notion of a tree structure, the kind of first-order 
structure over which we will interpret dominance constraints. After that, it will 
be straightforward to define the actual syntax and semantics. Finally, we look 
briefly at constraint graphs, a graphical syntactic alternative. 
2.1 Tree Structures 

Throughout, we take $\mathbb{N}$ to be the set of positive integers and $\mathbb{N}_0$ to be the set 
of nonnegative integers, and assume that $\Sigma$ is a ranked signature that contains 
function symbols or tree constructors $f, g, a, b, \ldots$, which are assigned arities by 
an arity function $\mathrm{ar} : \Sigma \to \mathbb{N}_0$. We further assume that $\Sigma$ contains at least two 
constructors, one of which is nullary, and the other one of arity at least 2. 
Intuitively, we want trees to be essentially the same as ground terms over $\Sigma$. 
Formally, we first define a tree domain $D$ to be a nonempty prefix-closed subset 
of $\mathbb{N}^*$; i.e., the elements of $D$ are words of positive integers. These words can 
be thought of as the paths from the root of a tree to its nodes. We write the 
concatenation of two words $\pi$ and $\pi'$ as juxtaposition $\pi\pi'$. 

We define a constructor tree $\tau$ to be a pair $(D_\tau, L_\tau)$ of a tree domain $D_\tau$ and a 
labeling function 

$L_\tau : D_\tau \to \Sigma$, 

with the additional property that for every $\pi \in D_\tau$, $\pi k \in D_\tau$ iff $1 \le k \le \mathrm{ar}(L_\tau(\pi))$. 
A finite constructor tree is a constructor tree whose domain is finite. 
Throughout, we will simply say “tree” to mean “finite constructor tree”. 
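
The definition can be checked mechanically on small examples. The following is a minimal sketch, not part of the original development: it assumes a tree is given as a Python dictionary from paths (tuples of positive integers, with `()` as the root) to constructor names, and the function name `is_constructor_tree` as well as the sample signature are hypothetical.

```python
from typing import Dict, Tuple

Path = Tuple[int, ...]  # a word over the positive integers; () is the root


def is_constructor_tree(labels: Dict[Path, str], arity: Dict[str, int]) -> bool:
    """Check that `labels` (whose keys form the domain) is a finite constructor tree."""
    if () not in labels:
        # A nonempty prefix-closed set of words always contains the empty word.
        return False
    for path, symbol in labels.items():
        if symbol not in arity:
            return False
        # Prefix-closure: the parent of every non-root path must be in the domain.
        if path and path[:-1] not in labels:
            return False
        # pi.k is in the domain iff 1 <= k <= ar(L(pi)).
        children = {p[-1] for p in labels if p and p[:-1] == path}
        if children != set(range(1, arity[symbol] + 1)):
            return False
    return True


# Hypothetical signature with one binary, one unary and one nullary constructor.
ARITY = {"f": 2, "g": 1, "a": 0}
TREE = {(): "f", (1,): "g", (1, 1): "a", (2,): "a"}  # the term f(g(a), a)
```

Checking only the immediate parent of each path suffices for prefix-closure, since the check is applied to every path in the domain.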

The tree structure $\mathcal{M}^\tau$ over the tree $\tau$ is a first-order model structure with the 
universe $D_\tau$ and whose interpretation function assigns relations over $D_\tau$ to a 
set of fixed predicate symbols. We will use the same symbols for the predicate 
symbols and their interpretations; as the latter are applied to paths and the former 
are applied to variables, there is no danger of confusion. The interpretation 
function is fully determined by $\tau$; so to specify a tree structure, it is sufficient to 
specify the underlying tree. 

In detail, the interpretation is as follows. If $f \in \Sigma$ has arity $n$, the labeling 
relation $\pi{:}f(\pi_1, \ldots, \pi_n)$ is true in $\mathcal{M}^\tau$ iff $L_\tau(\pi) = f$ and for all $1 \le i \le n$, 
$\pi_i = \pi i$. The dominance relation $\pi \lhd^* \pi'$ is true iff $\pi$ is a prefix of $\pi'$. 

2.2 Syntax and Semantics of Dominance Constraints 

With these definitions, it is straightforward to define the syntax and semantics of 
dominance constraints. Assuming a set of (node) variables $X, Y, \ldots$, a dominance 
constraint $\varphi$ has the following abstract syntax: 

$\varphi ::= X{:}f(X_1, \ldots, X_n) \qquad (f \in \Sigma,\ n = \mathrm{ar}(f))$ 
$\quad\ \mid\ X \lhd^* Y$ 
$\quad\ \mid\ \varphi \wedge \varphi'$ 

We use the formula $X = Y$ as an abbreviation for $X \lhd^* Y \wedge Y \lhd^* X$. 

We will start by considering only this (purely conjunctive) constraint language 
and successively allow more logical connectives, until we have arrived at the full 
first-order language in Section 5. 

Satisfaction of an atomic constraint is defined with respect to a pair $(\mathcal{M}^\tau, \alpha)$ 
of a tree structure $\mathcal{M}^\tau$ and a variable assignment $\alpha : \mathit{Var} \to D_\tau$ that assigns 
nodes to the variables. This is extended to satisfaction of arbitrary formulae in 
the usual Tarskian way. 
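
As an illustration, satisfaction of a conjunction of atomic constraints can be evaluated directly from these definitions. The sketch below is only one possible encoding: the tuple representation of atoms (`"lab"`, `"dom"`, `"ndom"`), the function names and the sample tree are assumptions made for this example.

```python
def dominates(p, q):
    """The dominance relation on paths: p dominates q iff p is a prefix of q."""
    return q[:len(p)] == p


def satisfies(labels, alpha, constraint):
    """Evaluate a conjunction of atomic constraints in the tree `labels` under `alpha`.

    Atoms are tuples: ("lab", X, f, (X1, ..., Xn)) for X:f(X1,...,Xn),
    ("dom", X, Y) for X <|* Y, and ("ndom", X, Y) for its negation.
    """
    for atom in constraint:
        if atom[0] == "lab":
            _, x, f, kids = atom
            node = alpha[x]
            if labels[node] != f:
                return False
            # The i-th argument variable must denote the i-th child of the node.
            if any(alpha[k] != node + (i,) for i, k in enumerate(kids, start=1)):
                return False
        elif atom[0] == "dom":
            if not dominates(alpha[atom[1]], alpha[atom[2]]):
                return False
        elif atom[0] == "ndom":
            if dominates(alpha[atom[1]], alpha[atom[2]]):
                return False
        else:
            raise ValueError(f"unknown atom: {atom!r}")
    return True


# The tree g(f(a, a)) and the constraint X1:g(X2) & X2 <|* X3 & X3:f(X4,X5) & X4:a.
TREE = {(): "g", (1,): "f", (1, 1): "a", (1, 2): "a"}
CONSTRAINT = [
    ("lab", "X1", "g", ("X2",)),
    ("dom", "X2", "X3"),
    ("lab", "X3", "f", ("X4", "X5")),
    ("lab", "X4", "a", ()),
]
GOOD = {"X1": (), "X2": (1,), "X3": (1,), "X4": (1, 1), "X5": (1, 2)}
BAD = dict(GOOD, X4=(1, 2), X5=(1, 1))  # children of X3 swapped
```

Here `GOOD` maps $X_2$ and $X_3$ to the same node, which is allowed because dominance is reflexive.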

Because dominance constraints can easily become unreadable, we will use con- 
straint graphs as a graphical device to represent them. They are essentially an 
alternative to the original syntax of the language. Constraint graphs are directed 
graphs with two kinds of edges: solid and dotted. The nodes of a constraint 
graph represent variables in a constraint. Labeling constraints are expressed by 
attaching the constructor to the node and drawing solid edges to the children; 
dominance constraints are expressed by drawing a dotted edge between the re- 
spective nodes. As an example, the graph below is the constraint graph for the 
constraint to its right. 



[Constraint graph omitted.] 

$X_1{:}g(X_2) \wedge X_2 \lhd^* X_3 \wedge X_3{:}f(X_4, X_5) \wedge X_4{:}a$. 




3 Solving Dominance Constraints 

Now we show that the satisfiability problems of all languages over dominance 
constraints between the (purely conjunctive) constraint language itself and the 
positive existential fragment are in NP. We first define an algorithm that decides 
satisfiability for the constraint language and prove the running time, soundness, 
and completeness. Then we present an algorithm that does the same thing, but 
lends itself more easily to implementation. Finally, we extend the results to the 
other propositional connectives. 

It turns out that it is actually easier to define satisfiability algorithms for dominance 
constraints if we additionally allow atomic constraints of the form $\neg X \lhd^* Y$. 
Hence, we are going to work with this extended language of dominance constraints 
in Sections 3.1 to 3.3. 

3.1 The Algorithm 

The first algorithm proceeds in three steps. First, we guess nondeterministically 
for each pair $X, Y$ of variables in $\varphi$ whether $X$ dominates $Y$ or not, and add the 
corresponding atomic constraint to $\varphi$. This is done by the (Choice) rule, where 
"or" stands for nondeterministic choice. 

(Choice) $\mathit{true} \to X \lhd^* Y$ or $\neg X \lhd^* Y$ 

In the second step, we saturate $\varphi$ according to the following deterministic propagation 
rules. We recall that $X = Y$ is an abbreviation for $X \lhd^* Y \wedge Y \lhd^* X$. 



(Refl) $\mathit{true} \to X \lhd^* X$ ($X$ occurs in $\varphi$) 

(Trans) $X \lhd^* Y \wedge Y \lhd^* Z \to X \lhd^* Z$ 

(Decomp) $X = Y \wedge X{:}f(X_1, \ldots, X_n) \wedge Y{:}f(Y_1, \ldots, Y_n) \to \bigwedge_{i=1}^{n} X_i = Y_i$ 

(Disj) $X{:}f(\ldots, X_i, \ldots, X_k, \ldots) \to \neg X_i \lhd^* X_k$ ($1 \le i \ne k \le n$) 

(Dom) $X{:}f(\ldots, Y, \ldots) \to X \lhd^* Y$ 

(Parent) $X = Y \wedge X'{:}f(\ldots, X, \ldots) \wedge Y'{:}g(\ldots, Y, \ldots) \to X' = Y'$ 

(Child) $X \lhd^* Y \wedge X{:}f(X_1, \ldots, X_n) \wedge \bigwedge_{i=1}^{n} (\neg X_i \lhd^* Y) \to Y \lhd^* X$ 

In the (Parent) rule, $f$ and $g$ need not be different. 

In the third step, we detect unsatisfiable constraints by applying the following 
clash rules. 

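(Clash1) $X{:}f(\ldots) \wedge Y{:}g(\ldots) \wedge X = Y \to \mathit{false}$, if $f \ne g$ 
(Clash2) $X \lhd^* Z \wedge Y \lhd^* Z \wedge \neg X \lhd^* Y \wedge \neg Y \lhd^* X \to \mathit{false}$ 
(Clash3) $X \lhd^* Y \wedge \neg X \lhd^* Y \to \mathit{false}$ 
(Clash4) $X{:}f(X_1, \ldots, X_i, \ldots, X_n) \wedge X_i \lhd^* X \to \mathit{false}$ 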

After the initial guessing step, the algorithm applies all instances of all propaga- 
tion and clash rules. We call a constraint to which no clash rule can be applied 
clash-free, the result of applying all possible rules to a constraint for as long 
as the constraint is clash-free its saturation, and a constraint which is its own 
saturation saturated. The algorithm outputs that its input is satisfiable if it can 
find a clash-free saturation (that is, can apply the guessing step in such a way 
that subsequent propagation and clash rules won’t produce false); otherwise, it 
outputs that the input is unsatisfiable. 

An example of the application of these rules is the proof of the unsatisfiability of 

$X{:}a \wedge X \lhd^* Y \wedge \neg Y \lhd^* X$, 

where $a$ is a nullary symbol. Application of the (Child) rule adds the new constraint 
$Y \lhd^* X$. But this makes the (Clash3) rule applicable, so the algorithm 
finds a clash. A more intricate example is the proof of the unsatisfiability of 

$Y{:}f(Z) \wedge X{:}g(U) \wedge U \lhd^* Z \wedge \neg X \lhd^* Y$. 

The (Dom) and (Trans) rules will give us $Y \lhd^* Z$, $X \lhd^* U$, and $X \lhd^* Z$. Now for 
the saturation to be clash-free, (Choice) must have guessed $Y \lhd^* X$; for if it had 
chosen $\neg Y \lhd^* X$, we would get a clash with (Clash2). Similarly, (Choice) must 
have guessed $\neg Z \lhd^* X$, for if it had chosen $Z \lhd^* X$, we could derive $U \lhd^* X$, which 
produces a clash with (Clash4). But in this case, we can derive $X \lhd^* Y$ with the 
(Child) rule, which causes a clash with (Clash3). 
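
The guess-and-check structure of the first algorithm can be sketched in a few lines of Python. This is an illustrative brute-force rendering, not an efficient solver: it enumerates every total guess of the dominance relation explicitly and, because the guess is total, replaces saturation by a check that the guess is closed under the propagation rules and triggers no clash rule. The data representation and the function names are assumptions of this sketch.

```python
from itertools import product


def clash_free_saturated(variables, labelings, dom):
    """`dom` is a total guess: (X, Y) in dom means X <|* Y, otherwise not X <|* Y."""
    def eq(x, y):
        return (x, y) in dom and (y, x) in dom

    for (x, y) in dom:                                   # (Trans)
        for z in variables:
            if (y, z) in dom and (x, z) not in dom:
                return False
    for z in variables:                                  # (Clash2)
        above = [x for x in variables if (x, z) in dom]
        for x in above:
            for y in above:
                if (x, y) not in dom and (y, x) not in dom:
                    return False
    for (x, f, kids) in labelings:
        for i, xi in enumerate(kids):
            if (x, xi) not in dom:                       # (Dom)
                return False
            if (xi, x) in dom:                           # (Clash4)
                return False
            for k, xk in enumerate(kids):                # (Disj)
                if i != k and (xi, xk) in dom:
                    return False
        for y in variables:                              # (Child)
            if ((x, y) in dom and (y, x) not in dom
                    and all((xi, y) not in dom for xi in kids)):
                return False
        for (y, g, kids2) in labelings:
            if eq(x, y):
                if f != g:                               # (Clash1)
                    return False
                if any(not eq(a, b) for a, b in zip(kids, kids2)):   # (Decomp)
                    return False
            if any(eq(a, b) for a in kids for b in kids2) and not eq(x, y):
                return False                             # (Parent)
    return True


def satisfiable(variables, labelings, pos, neg):
    """labelings: list of (X, f, (X1, ..., Xn)); pos/neg: sets of pairs (X, Y)."""
    forced = set(pos) | {(x, x) for x in variables}      # input and (Refl)
    if forced & set(neg):
        return False                                     # (Clash3) on the input
    free = [(x, y) for x in variables for y in variables
            if (x, y) not in forced and (x, y) not in neg]
    for bits in product([True, False], repeat=len(free)):   # (Choice)
        dom = forced | {p for p, b in zip(free, bits) if b}
        if clash_free_saturated(variables, labelings, dom):
            return True
    return False
```

On the two examples above, the sketch reports unsatisfiability; on a constraint such as $X{:}f(Y, Z) \wedge Y{:}a \wedge Z{:}a$ it finds a clash-free guess. The enumeration is exponential in the number of undetermined pairs, so it is only usable for very small constraints.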

It is easy to see that the algorithm runs in nondeterministic polynomial time. As we have guessed the 
dominance relations between all variables in the first step, the second step can 
never consistently add a new constraint; either the constraint is already known, 
or it clashes, by the (Clash3) rule. So we will only spend deterministic polynomial 
time on the application of propagation and clash rules. Note that one major 
change in the second algorithm below will be to allow the propagation rules to 
be more productive. 




Dominance Constraints: Algorithms and Complexity 111 



Proposition 1 (Soundness). A satisfiable dominance constraint has a clash-free 
saturation. 

Proof. Assume that the constraint $\varphi$ is satisfiable. Clearly, the guessing step of 
the algorithm can add a choice of (possibly negated) dominance constraints such 
that their conjunction $\varphi'$ with $\varphi$ is satisfiable as well; we only have to read off 
whether the denotations of two variables dominate each other in a fixed solution 
of $\varphi$. Now all propagation rules maintain satisfiability, and the preconditions of 
all clash rules are unsatisfiable. Hence, $\varphi'$ is saturated and clash-free. □ 



3.2 Completeness 

As usual, proving completeness is slightly more involved than proving soundness. 
Here, we proceed in two steps: First we show that a special class of saturated, 
clash-free constraints is satisfiable; then we show that every saturated, clash- 
free constraint can be extended by some additional conjuncts to a saturated, 
clash-free constraint of the restricted class. Together, this shows completeness: 

Proposition 2 (Completeness). A saturated and clash-free constraint is satisfiable. 

Incidentally, the proof also shows how to obtain a model for a clash-free constraint. 
But first, some terminology. We call $V \subseteq \mathcal{V}(\varphi)$ an equality set for $\varphi$ if 
$Y_1 \lhd^* Y_2$ in $\varphi$ for all $Y_1, Y_2 \in V$. All variables in an equality set must be mapped 
to the same node in a solution of $\varphi$. A variable $X$ is labeled in $\varphi$ if there is an 
$X'$ such that $\{X, X'\}$ is an equality set for $\varphi$ and $X'{:}f(X'_1, \ldots, X'_n)$ in $\varphi$ for 
some term $f(X'_1, \ldots, X'_n)$. We call a constraint $\varphi$ simple if all its variables are 
labeled, and if there is a so-called root variable $Y$ for $\varphi$ such that $Y \lhd^* Z$ in $\varphi$ 
for all $Z \in \mathcal{V}(\varphi)$. 

Lemma 1 (Satisfiability of Simple Constraints). A simple, saturated, and 
clash-free constraint is satisfiable. 

Proof. It is not difficult to show that for any $Z \in \mathcal{V}(\varphi)$, there is a unique 
sequence of maximal equality sets $E_1, \ldots, E_n$ that connect the root of $\varphi$ to $Z$ 
via labeling constraints. From this, we can read off the satisfying tree structure 
and variable assignment in a straightforward way. □ 

It remains to show that we can restrict our attention to simple constraints. An 
extension of a constraint $\varphi$ is a constraint of the form $\varphi \wedge \varphi'$ for some $\varphi'$. We 
will show how to extend a saturated, clash-free constraint to a simple, saturated, 
and clash-free constraint. 

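We define the set $\mathrm{con}_\varphi(X)$ of variables connected to $X$ in $\varphi$ as follows: 

$\mathrm{con}_\varphi(X) = \{\, Y \mid X \lhd^* Y$ in $\varphi$, $Y \lhd^* X$ not in $\varphi$, not exists $Z$ s.t. 
$\qquad\qquad X \lhd^* Z,\ Z \lhd^* Y$ in $\varphi$, $Z \lhd^* X,\ Y \lhd^* Z$ not in $\varphi \,\}$ 
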
Note that for this algorithm, $X \lhd^* Y$ not in $\varphi$ is the same as $\neg X \lhd^* Y$ in $\varphi$; it 
does, however, make a difference for the second algorithm below. Intuitively, a 
variable is connected to $X$ if it is a “minimal dominance child” of $X$. So for 
example, in 

$\varphi_1 := X \lhd^* Y \wedge \neg Y \lhd^* X \wedge X \lhd^* Z \wedge \neg Z \lhd^* X \wedge Y \lhd^* Z \wedge \neg Z \lhd^* Y$, 

$\mathrm{con}_{\varphi_1}(X) = \{Y\}$ and $\mathrm{con}_{\varphi_1}(Y) = \{Z\}$. 
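
The definition of connectedness translates directly into code. The following sketch assumes that a constraint is represented only by the set of its positive dominance atoms, and that the example constraint consists of a chain in which X strictly dominates Y and Y strictly dominates Z; the function name `connected` is hypothetical.

```python
def connected(x, pos):
    """Variables connected to `x`, where `pos` is the set of pairs (A, B) with A <|* B in phi."""
    variables = {v for pair in pos for v in pair}

    def strictly_above(a, b):
        # a <|* b is in phi, while b <|* a is not in phi
        return (a, b) in pos and (b, a) not in pos

    return {
        y for y in variables
        if strictly_above(x, y)
        and not any(strictly_above(x, z) and strictly_above(z, y) for z in variables)
    }


# Positive dominance atoms of the example constraint (assumed reading of phi_1).
PHI1 = {("X", "Y"), ("X", "Z"), ("Y", "Z")}
```

With this input, `connected("X", PHI1)` contains only `"Y"`: the variable `"Z"` is excluded because `"Y"` lies strictly between `"X"` and `"Z"`.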

We call $V \subseteq \mathcal{V}(\varphi)$ a disjointness set for $\varphi$ if for any two distinct variables 
$Y_1, Y_2 \in V$, $Y_1 \lhd^* Y_2$ not in $\varphi$. The idea is that all variables in a disjointness set 
can safely be placed at disjoint positions of a tree. 

Lemma 2. If $\varphi$ is saturated and $X \in \mathcal{V}(\varphi)$, then for all $Y_1, Y_2 \in \mathrm{con}_\varphi(X)$, the 
set $\{Y_1, Y_2\}$ is either an equality or disjointness set for $\varphi$. 

For a constraint $\varphi$ and a variable $X$ of $\varphi$, Lemma 2 implies the existence of maximal 
disjointness sets $V \subseteq \mathrm{con}_\varphi(X)$ for $\varphi$. Such a set is constructed by choosing 
one representative from every maximal equality set contained in $\mathrm{con}_\varphi(X)$. 

Now we can state and prove the key lemma of the completeness proof. 

Lemma 3 (Extension by Labeling). Let $\varphi$ be a constraint and $X$ a variable 
that is unlabeled in $\varphi$. Let $\{X_1, \ldots, X_n\} \subseteq \mathrm{con}_\varphi(X)$ be a disjointness set for 
$\varphi$ that is maximal among all disjointness sets that are subsets of $\mathrm{con}_\varphi(X)$, and 
let $f$ be a function symbol of arity $n$. If $\varphi$ is saturated and clash-free, then 
$\varphi \wedge X{:}f(X_1, \ldots, X_n)$ is also saturated and clash-free. 

Proof. Let $\varphi' = \varphi \wedge X{:}f(X_1, \ldots, X_n)$. Since we have not introduced new variables 
or dominance relations, $\varphi'$ inherits saturation with respect to the nondeterministic 
guessing rule, (Refl), and (Trans) and clash-freeness with respect to 
(Clash2) and (Clash3) from $\varphi$. The rule (Decomp) is not applicable to $\varphi'$; otherwise, 
$X$ would have been labeled in $\varphi$. By the same argument, the (Clash1) 
rule is not applicable to $\varphi'$. The only new way to apply the (Dom) rule is to the 
new labeling constraint; but the dominances (Dom) can derive are already in 
$\varphi$. The (Clash4) rule is not applicable, either: No $X_i \lhd^* X$ can be in $\varphi'$ because 
$X_i \in \mathrm{con}_\varphi(X)$. The arguments for the remaining rules are more interesting: 

(Disj) The only new way in which the (Disj) rule might apply is as follows: 

$X{:}f(X_1, \ldots, X_n) \to \neg X_i \lhd^* X_j$ ($i \ne j$) 

By assumption, $\{X_1, \ldots, X_n\}$ is a disjointness set for $\varphi$. Hence 
$\neg X_i \lhd^* X_j$ in $\varphi$, i.e. $\varphi'$ is saturated under (Disj). 

(Child) The only possible case in which the (Child) rule might newly apply 
looks as follows: 

$X \lhd^* Y \wedge X{:}f(X_1, \ldots, X_n) \wedge \bigwedge_{i=1}^{n} (\neg X_i \lhd^* Y) \to Y \lhd^* X$ 

But since we have chosen $\{X_1, \ldots, X_n\}$ to be a maximal disjointness 
set of $\mathrm{con}_\varphi(X)$, it follows from $X \lhd^* Y$ in $\varphi$ that either $Y \lhd^* X$ is already 
in $\varphi$, or there is a $1 \le i \le n$ such that $X_i \lhd^* Y$ in $\varphi$ (compare Lemma 2). 
Hence, the (Child) rule is not applicable in any new way either. 
(Parent) Finally, the only non-trivial new way in which to apply the (Parent) 
rule looks as follows. 

$X_i = Z \wedge X{:}f(\ldots, X_i, \ldots) \wedge Y{:}g(\ldots, Z, \ldots) \to X = Y$ 

We show that this is not possible either: $X_i = Z \wedge Y{:}g(\ldots, Z, \ldots)$ may 
not belong to $\varphi$. We distinguish four cases depending on the positive 
and negative dominance relations between $X$ and $Y$. 

1. $X \lhd^* Y$ in $\varphi$: 

a) $Y \lhd^* X$ in $\varphi$: This implies that $X$ would have been labeled in 
$\varphi$, which contradicts the assumption of the lemma. 

b) $\neg Y \lhd^* X$ in $\varphi$: Because of (Dom) and (Trans), $Y \lhd^* X_i$ in $\varphi$. 
Saturation under (Trans) and (Clash4) implies $\neg X_i \lhd^* Y$ in $\varphi$. 
But this contradicts $X_i \in \mathrm{con}_\varphi(X)$, as $Y$ could take the role of 
$Z$ in the second line of the definition of $\mathrm{con}_\varphi(X)$. 

2. $\neg X \lhd^* Y$ in $\varphi$: 

a) $Y \lhd^* X$ in $\varphi$: We show that all immediate children of $Y$ (i.e. 
$Z$ and all of its siblings in the labeling constraint) do not 
dominate $X$; then we can apply the (Child) rule to conclude 
$X \lhd^* Y$ in $\varphi$, which contradicts $\neg X \lhd^* Y$ in $\varphi$, i.e. the clash-freeness 
of $\varphi$ with respect to rule (Clash3). 

Clearly, $Z \lhd^* X$ not in $\varphi$; for otherwise, (Trans) would imply 
that $X_i \lhd^* X$ in $\varphi$, which contradicts the $\mathrm{con}_\varphi(X)$ condition for 
$X_i$. So let $Z'$ be a sibling of $Z$ in the labeling constraint above. 
If $Z' \lhd^* X$ in $\varphi$, then $Z' \lhd^* Z$ in $\varphi$ (by the (Trans) rule); but by 
the (Disj) rule, $\neg Z' \lhd^* Z$ in $\varphi$, in contradiction to (Clash3). 

b) $\neg Y \lhd^* X$ in $\varphi$: Saturation under (Trans) implies $X \lhd^* Z$ in $\varphi$, 
and saturation under (Dom) yields $Y \lhd^* Z$ in $\varphi$. This means 
that (Clash2) is applicable on $\varphi$, in contradiction to the clash-freeness 
of $\varphi$. □ 

Corollary 1 (Reduction to Simple Constraints). Every saturated, clash-free 
constraint has a simple, saturated, clash-free extension. 

Proof. Let $\varphi$ be saturated and clash-free. Without loss of generality, $\varphi$ has a root 
variable (otherwise, we choose a fresh variable $X$ and consider $\varphi \wedge \bigwedge \{X \lhd^* Y \mid Y \in \mathcal{V}(\varphi)\}$ 
instead of $\varphi$). 

By Lemma 3 we can successively label all variables in $\varphi$. The only problem is 
that the signature might not contain a function symbol for an arity we need; but 
we can get around this (artificial) problem by encoding these symbols with a 
nullary symbol and one symbol of arity $\ge 2$, whose existence we have assumed. 
□ 
Fig. 1. An underspecified representation of Every man loves a woman. 



3.3 A More Practical Solution Algorithm 

The algorithm we have just presented is convenient from a theoretical perspective, 
but it is impractical as an implementation. Let us consider an example 
from scope underspecification for illustration. The constraint graph in Fig. 1 
represents a dominance constraint describing the readings of the ambiguous 
sentence 

(1) Every man loves a woman. 

This constraint has 14 variables; so the algorithm will consider $2^{196}$ or about 
$10^{59}$ alternatives in the guessing step (one guess for each of the $14^2 = 196$ ordered 
pairs of variables), which of course is far too many to search through deterministically. 
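
The order of magnitude of this search space is easy to reproduce. The short sketch below assumes that (Choice) makes one binary guess for every ordered pair of variables, including pairs of the form (X, X); the function name `guess_count` is hypothetical.

```python
import math


def guess_count(n_vars: int) -> int:
    """Number of total guesses if (Choice) decides every ordered pair of variables."""
    return 2 ** (n_vars * n_vars)


ALTERNATIVES = guess_count(14)
MAGNITUDE = math.floor(math.log10(ALTERNATIVES))  # exponent of the leading power of ten
```

Under a different counting convention (for instance, skipping the reflexive pairs that (Refl) settles anyway) the exponent changes, but the number of alternatives remains far beyond what exhaustive enumeration can handle.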

A more practically feasible satisfiability algorithm is inspired by constraint pro- 
gramming 0. We replace the nondeterministic guessing rule (Choice) by two 
distribution rules. The application strategy is to apply propagation and clash 
rules for as long as possible; only when no such rule is applicable, a single appli- 
cation of a distribution rule takes place. 



Propagation rules from Sect. 3.1: (Refl) (Trans) (Decomp) (Child) (Disj) (Dom) (Parent) 

Clash rules from Sect. 3.1: (Clash1) (Clash2) (Clash3) (Clash4) 

(Distr1) $X \lhd^* Z \wedge Y \lhd^* Z \to X \lhd^* Y$ or $Y \lhd^* X$ 
(Distr2) $X \lhd^* Y \wedge X{:}f(X_1, \ldots, X_n) \to Y \lhd^* X$ or $\bigvee_{i=1}^{n} X_i \lhd^* Y$ 



Proving soundness of the modified algorithm is a trivial extension of the original 
soundness proof. The completeness proof has exactly the same structure as the 
one above, but the details of the proofs of Lemmas 2 and 3 have to be changed. 
On the example (1), the new algorithm will alternate between propagation steps 
and single applications of the (Distrl) rule. The first application of this rule will 
be to and Xg. The algorithm terminates after a total of three applications 
of (Distrl); this means that the search space has a size of at most eight. 

As it stands, the second algorithm can be implemented with reasonable efficiency, 
but it is still not perfect for the application to scope underspecification, 
mainly because it is based on the wrong set of atomic constraints ($\lhd^*$ and $\neg\lhd^*$); 
using $\lhd^*$, inequality $\ne$, and disjointness $\perp$ (two nodes are disjoint if they don’t 
dominate each other) would be better. A more advanced version of the second 
algorithm which takes this into account and runs even more efficiently is defined 
in Pj. 

Alternatively, an implementation of a dominance constraint solver can be based 
on finite set constraints. This was first suggested in [3] and is worked out e.g. 
in m- m compares these two fundamental alternatives to implementing dom- 
inance constraint solvers in terms of runtimes and search spaces. An imple- 
mentation of the set constraint based solver can be found on the WWW at 
http://www.coli.uni-sb.de/cl/projects/chorus/demo.html 



3.4 Larger Logical Languages 

The algorithms we have defined so far solve dominance constraints that are 
pure conjunctions of labeling, dominance, and negated dominance constraints. 
We now extend them to allow in addition disjunctions and, later, negations 
in formulae over these atomic constraints. Positive occurrences of existential 
quantifiers can always be added, as they make no difference for satisfiability. 
The proofs for this section are simple and will be omitted. 

In a nondeterministic setting, it is easy to deal with disjunctions; all we have to 
do is to go through the formula recursively and guess for each disjunction which 
disjunct can be satisfied. In this way, we produce a conjunction that we can feed 
into the original algorithm. It is easily shown that a formula $\varphi$ of disjunctions 
and conjunctions over dominance constraints is satisfiable iff there is a choice of 
disjuncts that has a clash-free saturation. 

The only difficulty in handling negations is to deal with negated labeling constraints, 
as a formula containing negations can clearly be reduced to an equivalent 
one where the only negations are single negations of atomic formulae, and 
we already know what to do with negated dominance constraints. We can get 
rid of negated labeling constraints as well by replacing them with satisfiability 
equivalent formulae that do not contain negated labeling constraints. If the 
signature is finite, we can replace a constraint $\neg X{:}f(X_1, \ldots, X_n)$ by 

$\bigvee_{g \ne f} X{:}g(X'_1, \ldots, X'_{\mathrm{ar}(g)}) \ \vee\ \bigvee_{i=1}^{n} \bigl( X{:}f(X''_1, \ldots, X''_n) \wedge X_i \ne X''_i \bigr)$ 

where the $X'_i$ and $X''_i$ are fresh variables. Now all we need to show is that a 
negated labeling constraint $\varphi$ is satisfied by a pair $(\mathcal{M}^\tau, \alpha)$ iff its encoding $\varphi'$ is 
satisfied by $(\mathcal{M}^\tau, \alpha')$, where $\alpha'$ agrees with $\alpha$ on $\alpha$’s domain. 








A. Koller, J. Niehren, and R. Treinen 



This construction does not work for infinite signatures since the first disjunction 
would become infinite. Except in pathological cases, however, negated labeling 
constraints can be eliminated in this case as well, at the price of additional case 
distinctions. 

Taking all the results from this section together, we have shown: 

Proposition 3. The satisfiability problem of the positive existential fragment 
over dominance constraints (and, of course, all smaller languages) is in NP. 

4 NP-Hardness of Dominance Constraints 

As we have just seen, the satisfiability problems of all languages over dominance 
constraints which are sublanguages of the positive existential fragment are in 
NP. In this section, we complement this result by showing that even for purely 
conjunctive dominance constraints, this problem is NP-hard. To this end, we 
reduce the 3SAT problem to the satisfiability problem of dominance constraints. 
We only sketch the proof, as the main construction is quite intuitive, and further 
details provide no further insight. Together with the result from the previous 
section, we obtain the following result: 

Theorem 1. The satisfiability problems of all logical languages over dominance 
constraints between the (purely conjunctive) constraint language and the positive 
existential fragment are NP-complete. 

3SAT, a classical NP-hard problem, is the satisfiability problem of propositional 
formulae in conjunctive normal form where every conjunct is a disjunction of 
exactly three literals. This special type of conjunctive normal form is called 
3-CNF. 



formulae $\psi = C_1 \wedge \ldots \wedge C_m$ 
clauses $C_i = L_{i1} \vee L_{i2} \vee L_{i3}$ 
literals $L_{ij} = X_k$ or $\neg X_k$ 

We assume that the variables that occur in $\psi$ are $X_1, \ldots, X_n$. 
The reduction is by encoding formulae in 3-CNF as satisfaction equivalent dom- 
inance constraints. The central problem that we must overcome is to model 
clauses without using disjunctions. We do this by using dominance triangles, 
subconstraints whose graphs look like this: 




If $(\mathcal{M}^\tau, \alpha)$ is a solution of such a constraint, then $\alpha$ must map exactly one of the 
variables $X_2, X_3, X_4$ to the same node as $X_1$ because $\alpha(X_2)$ must be a prefix of 
$\alpha(X_1)$, which in turn must be a prefix of $\alpha(X_4)$. We can exploit this effect to 
model three-way disjunction, which is just what we need to encode a clause. 








Fig. 2. An encoding of $(X_1 \vee \neg X_2 \vee X_3) \wedge (\neg X_1 \vee X_2 \vee X_3)$ as a dominance constraint. 



4.1 An Example of the Reduction 

As an example of a 3-CNF formula, consider the following formula $\psi$. 

(2) $(X_1 \vee \neg X_2 \vee X_3) \wedge (\neg X_1 \vee X_2 \vee X_3)$ 
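
For reference when following the reduction, the satisfying valuations of this formula can be listed by brute force. The sketch below is unrelated to the dominance-constraint encoding itself; it assumes a representation of literals as nonzero integers (`k` for $X_k$, `-k` for $\neg X_k$), and the function name is hypothetical.

```python
from itertools import product


def satisfying_assignments(clauses, n_vars):
    """All valuations (tuples of booleans for X_1..X_n) that satisfy every clause."""
    result = []
    for values in product([False, True], repeat=n_vars):
        # A literal l is true iff the value of X_|l| matches the sign of l.
        if all(any(values[abs(l) - 1] == (l > 0) for l in clause) for clause in clauses):
            result.append(values)
    return result


# Formula (2): (X1 or not X2 or X3) and (not X1 or X2 or X3)
FORMULA_2 = [(1, -2, 3), (-1, 2, 3)]
MODELS = satisfying_assignments(FORMULA_2, 3)
```

Each clause rules out exactly one of the eight valuations, so six valuations remain; each of them should correspond to at least one solution of the encoding.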

The constraint graph in Fig. 2 represents the dominance constraint $\varphi$ which is 
the encoding of $\psi$. We are drawing the constraint graph in a somewhat simplified 
manner by omitting all labels of inner nodes and most variable names; all 
inner nodes should be read as being labeled with a fixed binary constructor $f$. 
The signature we use is $\{f{:}2, \mathit{true}{:}0, \mathit{false}{:}0\}$, but the proof remains valid if the 
signature contains more labels. 

We claim that $\varphi$ is satisfiable iff $\psi$ is satisfiable. To understand this, let us take 
a closer look at the various parts of the diagram. 

The lower left part of the graph (below the node $S$) holds a variable assignment: 
For each of the variables $X_k$ that occur in $\psi$, there is one node. In a solution, 
each of these nodes must be labeled with either true or false, but not both. 
We can view $\psi$ as a constraint on admissibility of variable assignments by calling 
a variable assignment admissible if it satisfies $\psi$. Each clause imposes such a 
restriction on the variable assignments; within a clause, we have a choice between 
three different options for satisfying the constraint. 

The dominance constraint expresses the very same thing. 

Because it is part of a dominance triangle, $C_1$ must be identified with one of the 
$L_{1j}$ in any solution. But once we have identified $C_1$ with one of the three $L_{1j}$, 
we have decided which of the clause $C_1$’s literals we want to satisfy: The right 
daughter of the chosen $L_{1j}$ node is identified with $S$, some entries in the variable 
assignment subtree may be skipped, and then a value restriction is imposed on 
one of the variables $X_k$. In the example, $L_{11}$ forces the label of $X_1$ to be true; 
$L_{12}$ forces the label of $X_2$ to be false; etc. We have imposed a constraint on 
the variable assignment that is obviously equivalent to that imposed by the first 
clause in $\psi$. 

The second clause is represented in a similar way: The dominance triangle between 
$L_{21}$, $C_2$, and $L_{23}$ allows a choice of which literal of this clause we want to 
satisfy. Whichever literal we pick, its right daughter first skips the entry for $C_1$ 
(identifying $S$ with one of the $S_{2j}$), and then it selects a variable entry and 
imposes a value constraint. 

An important detail of this encoding is the presence of more nodes than just 
the $C_i$ in the main branch of the graph (for example, there are two additional 
nodes between $C_1$ and $C_2$ in the constraint graph). These nodes are “rubbish 
dumps” which can be used to store unneeded material in such a way that it won’t 
interfere with anything else. Suppose we identified $C_1$ and $L_{12}$ in a solution of $\varphi$. 
Then $L_{11}$ will be identified with the left daughter of $C_1$, and $L_{13}$ will be identified 
with the mother of $C_1$. Clearly, we do not want any other part of the constraint 
to say anything about the right child of $C_1$’s mother because otherwise, we 
might run into unnecessarily unsatisfiable dominance constraints. This means 
that above each $C_i$ node, we need two additional nodes to drop material from 
the identification process. We do not need any additional nodes below the $C_i$ 
because the unnecessary material is then a left child of the selected literal node 
and can safely be stored below $C_i$’s left daughter. 



4.2 The Reduction in the General Case 

Now that we have made the intuition clear, we define the encoding in a more 
systematic way. 

We build the constraint graph that corresponds to $\varphi$ from the “building blocks” 
in Fig. 3. Larger building blocks can include several copies of smaller building 
blocks. For most of the building blocks, we have specified with arrows an upper 
and a lower attachment site where it can be composed with other blocks by 
identifying the two attachment sites; we write such compositions as trees whose 
labels are the two building blocks. Furthermore, we take a block with a superscript 
$s$ (such as Skip with superscript $j - 1$ in W) to mean $s$-fold composition 
of building blocks. So we want the $X_i$ block to consist of $i - 1$ occurrences of 
Skip blocks and two additional nodes that are immediate children of the lowest 
attachment site in the sequence of Skips, the left of which is labeled with true. 
It is easy to see that the constraint graph from the previous section was built 
according to this scheme. The overall structure consists of m entries for the 
clauses, below which n Skip blocks hold a variable assignment. Within each 
Ci block, there is a dominance triangle that allows the selection of a literal, 
together with a sufficient number of SkipC blocks to skip lower clauses. Finally, 
the encoding of a literal selects a propositional variable and imposes a value 
restriction. 







Fig. 3. Building blocks for the encoding of 3SAT as a dominance constraint. 



The intuitive explanation from the beginning of this section should make clear 
that this encoding is correct. For a formal proof, we can encode valuations satisfying 
a 3-CNF formula $\psi$ as tree structures and variable assignments satisfying 
the encoding of $\psi$, and vice versa. The gory details of this construction can be 
found in |H|. 

5 The First-Order Theory of Dominance Constraints 

In the sections so far, we have focused on propositional languages over dominance 
constraints. Now we allow all propositional and first-order connectives (over 
the same set of atomic constraints) and consider the validity problem of first- 
order formulae over dominance constraints. We first show a direct proof of 
the decidability of this problem for the case of bounded arity by reduction to 
second-order monadic logic. Afterwards, we show that the problem has non- 
elementary complexity by reducing the equivalence problem of regular languages 
with complement to it. Both results hold true for validity both over finite and 
over arbitrary trees. 
5.1 Reduction to Second-Order Monadic Logic 

In this section we assume a (finite or infinite) signature with a bound $n$ on the 
arities of the function symbols. We show how to reduce the theory of dominance 
constraints to $S(n+1)S$, the second-order monadic theory of $n + 1$ successor 
functions. The reduction carries over verbatim to $WS(n+1)S$, so finiteness of 
the trees doesn’t make a difference. 

The language of $S(n+1)S$ contains first-order variables $x, y, z, \ldots$ and second-order 
monadic variables (i.e., variables denoting sets) $X, Y, Z, \ldots$, a binary relation 
symbol $\in$, a constant $\epsilon$, and for every $0 \le i \le n$, a unary function symbol $i$. 
The universe of the corresponding structure is the set $\{0, \ldots, n\}^*$ of words over 
the alphabet $\{0, \ldots, n\}$, where $\epsilon$ denotes the empty word, and a function symbol 
$i$ is interpreted as $i(w) = wi$. The formula $x \in X$ is true iff the denotation of $x$ 
is contained in the denotation of $X$. The theory $S(n+1)S$ is the set of all closed 
formulae valid in this structure. The theory $WS(n+1)S$ is defined by restricting 
the denotation of monadic second-order variables to finite sets. The decidability 
of these theories has been established in [E] for the case of $WS(n+1)S$ and [O] 
for the case of $S(n+1)S$. A function application $i(x)$ is usually written as $x.i$. 

We encode tree structures as the denotations of set variables $X$. Both are sets of 
words, and we can easily express closure under prefix and left brother; the only 
challenge is how to encode labels and arities. Below, we will first write down an 
$S(n+1)S$ formula that characterizes (encodings of) tree structures. Once this 
is done, we can easily encode dominance and labeling constraints. 

We assume that the function symbols of a given arity $k$ are numbered $1, 2, \ldots$ 
if there are infinitely many symbols of arity $k$ and $1, \ldots, o_k$ if there are only 
finitely many. If there are infinitely many function symbols of arity $k$, we write 
$o_k = \infty$. 

We encode a tree $\tau$ as the following finite subset of $\{0, \ldots, n\}^*$: 

$T_\tau = \{\, \pi 0^i \mid \pi \in D_\tau,\ i \le \text{the number of the label } L_\tau(\pi) \text{ among the symbols of its arity} \,\}$ 

In words, we represent t as a set of words in S{n + l)S by first requiring that 
all words in the tree domain of r are also in Tt. Then we add 0-successors to 
signify the label: The label of the node tt of r is represented by the arity of the 
node 7T together with the length of the string of O’s attached to tt in T,-. That is, 
instead of requiring that the label should determine the arity of a node, we only 
have to make sure that the label which is indicated by the arity and the zeroes 
really exists (i.e. there are at most zeroes). 
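The encoding can be tried out on small trees. The following is a minimal sketch, not part of the original construction: it assumes a tree is given as a pair `(number, children)`, where `number` is the position of the node's label among the symbols of its arity, and it writes words as Python strings of digits (so it only covers $n \le 9$). The name `encode` is hypothetical.

```python
def encode(tree, path=""):
    """Return the set of words encoding `tree`, starting at node `path`.

    tree is (number, children): `number` is the label's position among
    the function symbols of its arity (1, 2, ...), `children` a list of trees.
    """
    number, children = tree
    # The node itself plus `number` zero-successors: pi, pi0, ..., pi0^number.
    words = {path + "0" * i for i in range(number + 1)}
    # Child i of the node sits at path.i, with children numbered from 1.
    for i, child in enumerate(children, start=1):
        words |= encode(child, path + str(i))
    return words


# f(a, a), with f the first binary symbol and a the first constant.
example = encode((1, [(1, []), (1, [])]))
```

For this example the result contains the three tree nodes (the empty word, `1` and `2`) and one zero-successor for each of them.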

We encode a closed first-order formula $\varphi$ over dominance constraints as a closed
monadic second-order formula containing a second-order variable $X$ which denotes
the encoding $T_\tau$ of a tree model $\tau$. To this end, we first axiomatize the
general structure of a $T_\tau$:

$$
\begin{aligned}
\mathit{tree}(X) ={} & \forall x \bigwedge_{i=0,\ldots,n} (x.i \in X \rightarrow x \in X) && (3)\\
\wedge{} & \forall x \bigwedge_{i=2,\ldots,n} (x.i \in X \rightarrow x.(i-1) \in X) && (4)\\
\wedge{} & \forall x ((x \in X \wedge \neg\exists y\; x = y.0) \rightarrow x.0 \in X) && (5)\\
\wedge{} & \forall x \Bigl(\bigwedge_{i=1,\ldots,n} x.0.i \notin X\Bigr) && (6)\\
\wedge{} & \forall x (x.1 \notin X \rightarrow x.0^{o_0+1} \notin X) && (7)\\
\wedge{} & \forall x \bigwedge_{i=1,\ldots,n} (x.i \in X \wedge x.(i+1) \notin X \rightarrow x.0^{o_i+1} \notin X) && (8)
\end{aligned}
$$

(In this formula, the expression $x.(n+1) \notin X$ should be read as true.)

The formula first says that $X$ is a tree, that is, prefix-closed (formula (3)) and
closed under left brother (formula (4)). By formula (5) every “proper” tree node,
that is, a word not ending on 0, has to be labeled. Formula (6) together with (3)
ensures that if $x0w \in X$, then $w \in \{0\}^*$. Finally, the consistency of the label of
a node with its number of children (in the sense discussed above) is expressed
by formula (7) for the case of nullary function symbols, and by formula (8) for
non-nullary function symbols.
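For a finite set of words, the properties that the text attributes to formulas (3) to (8) can be checked directly. The sketch below is only an illustration under stated assumptions: words are Python strings of digits (so $n \le 9$), `bound[k]` gives the number of function symbols of arity `k` (with `None` standing for infinitely many and a missing key for none), and the label bound is tested at proper tree nodes only. The name `is_tree_encoding` is hypothetical.

```python
def is_tree_encoding(words, n, bound):
    """Check a finite set of words over the digits 0..n against (3)-(8)."""
    X = set(words)
    for w in X:
        # (3) prefix closure
        if w and w[:-1] not in X:
            return False
        # (4) closure under left brother
        if w and int(w[-1]) >= 2 and w[:-1] + str(int(w[-1]) - 1) not in X:
            return False
        # (5) every proper tree node carries at least one zero
        if not w.endswith("0") and w + "0" not in X:
            return False
        # (6) a zero is never followed by a non-zero letter
        if len(w) >= 2 and w[-2] == "0" and w[-1] != "0":
            return False
    for w in X:
        if w.endswith("0"):
            continue  # part of a label encoding, not a tree node
        arity = sum(1 for i in range(1, n + 1) if w + str(i) in X)
        limit = bound.get(arity, 0)
        # (7), (8): no more zeroes than there are symbols of this arity
        if limit is not None and w + "0" * (limit + 1) in X:
            return False
    return True


signature = {0: 1, 2: 1}  # one constant, one binary symbol
good = is_tree_encoding({"", "0", "1", "10", "2", "20"}, 2, signature)
bad_label = is_tree_encoding({"", "0", "00", "1", "10", "2", "20"}, 2, signature)
```

The second set is rejected because the root carries two zeroes although the assumed signature has only one binary symbol.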

To complete our translation, we need another auxiliary predicate, $\mathit{treenode}(x)$,
which expresses that $x$ is a “proper tree node” (i.e. not part of the encoding of a
label). More exactly, under the assumption that $\mathit{tree}(X)$ holds, we will get that
$\mathit{treenode}(x)$ holds exactly when $x \in X$ and $x \in \{1,\ldots,n\}^*$:

$$\mathit{treenode}(x) := x \in X \wedge \neg\exists y\; x = y.0$$

Finally, we encode an arbitrary first-order formula $\Phi$ over the dominance
constraints as an S(n+1)S-formula $\Phi_X$ by

1. relativizing all quantifiers by the predicate $\mathit{treenode}(\cdot)$;

2. replacing $x \lhd^* y$ by $\mathit{prefix}(x,y)$;

3. replacing $x{:}f(x_1,\ldots,x_n)$ by $\mathit{label}_f(x, x_1,\ldots,x_n)$,

where, for $f$ the $j$-th function symbol of arity $n$, we define

$$\mathit{label}_f(x, x_1,\ldots,x_n) = \bigwedge_{i=1,\ldots,n} x_i = x.i \;\wedge\; x.0^{j} \in X \;\wedge\; x.0^{j+1} \notin X \;\wedge\; x.(n+1) \notin X$$
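The prefix predicate is expressed in S(n+1)S as usual:

$$\mathit{prefix}(x,y) = \forall Y \Bigl(\bigl(y \in Y \wedge \forall z \bigwedge_{i=1,\ldots,n} (z.i \in Y \rightarrow z \in Y)\bigr) \rightarrow x \in Y\Bigr)$$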
Theorem 2. A first-order formula $\Phi$ over dominance constraints is valid in
the class of all (resp. finite) tree structures iff the following formula is valid in
S(n+1)S (resp. WS(n+1)S):

$$\forall X (\mathit{tree}(X) \rightarrow \Phi_X)$$

Using the signature consisting of a constant $a$ and a binary $f$, the following
formula is valid over finite trees but not valid over infinite trees:

$$\forall X, Y, Z\, (X{:}f(Y,Z) \rightarrow \exists V (V{:}a \wedge X \lhd^* V))$$

Correspondingly, the following S3S-formula is valid in WS3S but not in S3S:

$$
\begin{aligned}
\forall X (\mathit{tree}(X) \rightarrow \forall x, y, z (&\mathit{treenode}(x) \wedge \mathit{treenode}(y) \wedge \mathit{treenode}(z) \wedge \mathit{label}_f(x,y,z)\\
&\rightarrow \exists v (\mathit{treenode}(v) \wedge \mathit{label}_a(v) \wedge \mathit{prefix}(x,v))))
\end{aligned}
$$



Corollary 2. The theory of first-order formulae over dominance constraints 
with a signature of bounded arity is decidable. 

5.2 The First-Order Theory is Non-elementary 

We recall that a problem has non-elementary complexity if there is no algorithm
for it running in time bounded by $\exp_k(n)$ for any $k$, where $\exp_0(n) = n$ and
$\exp_{k+1}(n) = 2^{\exp_k(n)}$ (see, for instance, [7] for a survey).
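To get a feeling for how fast these bounds grow, the iterated exponential can be computed for very small arguments. This is a minimal sketch assuming the usual definition $\exp_0(n) = n$, $\exp_{k+1}(n) = 2^{\exp_k(n)}$; the function name `exp_tower` is hypothetical.

```python
def exp_tower(k, n):
    """Return exp_k(n): a tower of k twos with n on top."""
    value = n
    for _ in range(k):
        value = 2 ** value
    return value


small = [exp_tower(k, 2) for k in range(4)]
```

Already `exp_tower(4, 2)` is $2^{65536}$, a number with nearly twenty thousand decimal digits, and a non-elementary problem escapes every fixed height $k$.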

Theorem 3. The first-order theory of dominance constraints has non-elemen- 
tary complexity, both in the case of the finite tree models and in the case of 
arbitrary tree models. 

Recall that we have assumed the signature to contain at least a constant and a 
binary function symbol. We show this theorem by a reduction of the following 
classical problem: 

Theorem 4 (Stockmeyer and Meyer, [18]). The problem whether two regular
expressions formed with 1, 2, concatenation, union and complement (interpreted
with respect to $\{1,2\}^*$) denote the same set is non-elementary.

We define our syntax of regular expressions as

$$R ::= 1 \mid 2 \mid R \cup R \mid RR \mid \neg R$$

The language defined by the regular expression $R$ is called $\mathcal{L}_R$.

Given two variables $X$ and $Y$, we translate a regular expression $R$ of this class
into a formula $\varphi[R](X,Y)$ with free variables $\{X,Y\}$. Roughly, $\varphi[R](X,Y)$
expresses that in any of its solutions, $X$ dominates $Y$, and the path between the
two nodes is in $\mathcal{L}_R$.

$$
\begin{aligned}
\varphi[1](X,Y) &= \exists Z\,(X{:}f(Y,Z))\\
\varphi[2](X,Y) &= \exists Z\,(X{:}f(Z,Y))\\
\varphi[R_1 \cup R_2](X,Y) &= \varphi[R_1](X,Y) \vee \varphi[R_2](X,Y)\\
\varphi[R_1R_2](X,Y) &= \exists Z\,(\varphi[R_1](X,Z) \wedge \varphi[R_2](Z,Y))\\
\varphi[\neg R](X,Y) &= X \lhd^* Y \wedge \neg\varphi[R](X,Y)
\end{aligned}
$$



The following formula says that a tree is (an approximation of) the full binary
tree. More precisely, if $\tau$ is a tree satisfying Bin, all of its inner nodes must be
labeled with $f$ and all of its leaves must be labeled with $a$.

$$\mathit{Bin} = \forall X\,(X{:}a \vee \exists Y \exists Z\; X{:}f(Y,Z))$$



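Proposition 4. For any tree structure $\mathcal{M}$ and variable assignment $\alpha$,

$$\mathcal{M}, \alpha \models \mathit{Bin} \wedge \varphi[R](X,Y) \quad\text{iff}\quad \text{there exists } v \in \mathcal{L}_R : \alpha(Y) = \alpha(X) \cdot v.$$

Proof. This is proven by induction on $R$. The only non-trivial case is the
complement. Let $(\mathcal{M}, \alpha)$ satisfy Bin. Then

$$
\begin{aligned}
& \mathcal{M}, \alpha \models \varphi[\neg R](X,Y)\\
\Leftrightarrow{} & \mathcal{M}, \alpha \models X \lhd^* Y \wedge \neg\varphi[R](X,Y) && \text{by definition}\\
\Leftrightarrow{} & \alpha(X) \lhd^* \alpha(Y) \text{ and } \neg\exists v \in \mathcal{L}_R : \alpha(Y) = \alpha(X) \cdot v && \text{by induction}\\
\Leftrightarrow{} & \exists v \in \mathcal{L}_{\neg R} : \alpha(Y) = \alpha(X) \cdot v && (*)
\end{aligned}
$$

The step (*) is justified by the fact that in a solution of Bin, $\alpha(X) \lhd^* \alpha(Y)$ iff
there is a word $v \in \{1,2\}^*$ with $\alpha(Y) = \alpha(X) \cdot v$, and that such a $v$ is unique. □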
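The path languages in Proposition 4 can be explored on bounded word lengths, which corresponds to looking at paths from the root of a full binary tree of bounded depth. The sketch below is an illustration only: it assumes regular expressions are written as nested tuples (`"1"`, `"2"`, `("union", r, s)`, `("concat", r, s)`, `("not", r)`), a representation chosen here for convenience, and the names `words` and `language` are hypothetical. Restricting to length at most `max_len` is exact for each operator, including complement, which is taken relative to $\{1,2\}^*$.

```python
from itertools import product


def words(max_len):
    """All words over {1, 2} of length at most max_len."""
    return {"".join(w) for k in range(max_len + 1) for w in product("12", repeat=k)}


def language(regex, max_len):
    """Words of length <= max_len that belong to the language of `regex`."""
    if regex in ("1", "2"):
        return {regex} if max_len >= 1 else set()
    op = regex[0]
    if op == "union":
        return language(regex[1], max_len) | language(regex[2], max_len)
    if op == "concat":
        left = language(regex[1], max_len)
        right = language(regex[2], max_len)
        return {u + v for u in left for v in right if len(u + v) <= max_len}
    if op == "not":
        return words(max_len) - language(regex[1], max_len)
    raise ValueError(f"unknown operator: {op!r}")


not_single = language(("not", ("union", "1", "2")), 2)
r1 = ("concat", "1", ("union", "1", "2"))
r2 = ("union", ("concat", "1", "1"), ("concat", "1", "2"))
```

A difference found at some bound refutes the equality of two expressions; agreement up to a bound does not prove it, which is consistent with the problem being hard in general.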

We can finally reduce the equivalence problem of regular expressions to the 
theory of dominance constraints: 

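Lemma 4. Let $R_1$ and $R_2$ be regular expressions, $\mathcal{I}$ the class of (possibly infinite) tree
models and $\mathcal{T}$ the class of finite tree models. Then we have that

$$
\begin{aligned}
& \mathcal{L}_{R_1} = \mathcal{L}_{R_2} && (9)\\
\Leftrightarrow{} & \models_{\mathcal{I}} \mathit{Bin} \rightarrow \forall X \forall Y (\varphi[R_1](X,Y) \leftrightarrow \varphi[R_2](X,Y)) && (10)\\
\Leftrightarrow{} & \models_{\mathcal{T}} \mathit{Bin} \rightarrow \forall X \forall Y (\varphi[R_1](X,Y) \leftrightarrow \varphi[R_2](X,Y)) && (11)
\end{aligned}
$$

Proof. (9) $\Rightarrow$ (10): If $\mathcal{L}_{R_1} = \mathcal{L}_{R_2}$ then, by Proposition 4, the formulae $\varphi[R_1]$ and
$\varphi[R_2]$ are equivalent in any model of Bin.
(10) $\Rightarrow$ (11): Immediate since $\mathcal{T} \subseteq \mathcal{I}$.

$\neg$(9) $\Rightarrow$ $\neg$(11): Let $v \in \mathcal{L}_{R_2} - \mathcal{L}_{R_1}$, and let $\tau$ be the binary tree of depth
$\mathit{length}(v)$; that is, all nodes of depth $< \mathit{length}(v)$ are labeled with $f$, and all
nodes of depth $\mathit{length}(v)$ are labeled with $a$. $\mathcal{M}^\tau$, together with the variable
assignment $\{X \mapsto \varepsilon, Y \mapsto v\}$, satisfies $\varphi[R_2]$ but not $\varphi[R_1]$, as a consequence of
Proposition 4. □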
6 Conclusion

In this paper, we have analyzed the complexity of various logical languages over 
dominance constraints. We have shown that the satisfiability problems of all 
languages between the purely conjunctive constraint language and the positive 
existential fragment are NP-complete. We have presented two algorithms for 
solving dominance constraints, the second of which has been implemented and 
works well for real-world examples. Finally, we have presented a new proof of the 
decidability of the first-order theory of dominance constraints with signatures 
of bounded arity by encoding it into (W)SnS, and we have shown that its
complexity is non-elementary. 

From a practical perspective, the most important of the logical languages over 
dominance constraints we have considered here is the purely conjunctive con- 
straint language. Despite the NP-hardness result we have derived for this lan- 
guage, there are implementations that decide satisfiability very efficiently for 
constraints from scope underspecification. These implementations are either ad- 
vanced variants of the second algorithm presented here or based on finite set 
constraints; either way, they follow the “propagate and distribute” strategy ad- 
vocated by constraint programming. 

The observation that the general intractability does not affect the behaviour 
of actual implementations suggests that the linguistically relevant dominance 
constraints all belong to a subclass with an easier satisfiability problem, and 
that our algorithm exploits this automatically. The NP-hardness proof in this 
paper can be invalidated by only considering normal constraints, which contain 
an inequality constraint between any two variables for which they contain a 
labeling constraint; but inequality constraints can be used to build another NP- 
hardness proof. So an exact characterization of this subclass is an open problem. 
Abstract. The principle of compositionality, as standardly defined, regards
grammars as compositional that are not compositional in an intuitive
sense of the word. There is, for example, no notion of a part of a
string or structure involved in the formal definition. We shall therefore
propose here a stricter version of compositionality. It consists in a
conjunction of principles which assure among other things that complex signs are
in a literal sense made from simpler signs, with the meaning and syntactic
type being computed in tandem. We shall argue that given this strict
principle, quite powerful string handling mechanisms must be assumed.
Linear Context Free Rewrite Systems (see [?]) are not enough to generate
human languages, but most likely Literal Movement Grammars will
do.



A grammar is compositional if the meaning of a (complex) expression is de- 
termined from the meaning of its (immediate) parts together with their mode 
of combination. A language is compositional if there is a compositional grammar
that generates it. (See [?] for a discussion.) Recently, Zadrozny has
presented a proof that any meaning assignment function can go with a
compositional grammar. This proof has been dismissed by Kazmi and Pelletier [7]
and Westerstahl [?] on the grounds that Zadrozny is computing the meaning
through a detour. He introduces new functions, one per word, and reduces the re- 
quirement of compositionality to the solution of an equation, which always exists 
if one assumes the existence of non-well founded sets. In fact, Zadrozny him- 
self finds this solution formalistic and proposes restrictions under which certain 
pathological examples are excluded. So, there is an agreement that one should 
not count the proof by Zadrozny as showing us anything about the problem of 
compositionality of language — if the latter is understood in an intuitive sense. 
But in what sense can or must compositionality be understood if it is to have any 
significance? And what is the source of the complaints that people like Kazmi, 
Pelletier, Westerstahl (and others) raise? In this paper we will try to elaborate 
a formal setup of language as a semiotic system that allows us to discuss this
problem in a meaningful way. We shall propose a definition of compositionality, called
strict compositionality, which is restrictive and therefore nonvacuous.

M. Moortgat (Ed.): LACL’98, LNAI 2014, pp. 126-^3 2001. 
Before we enter the discussion, we need to provide some basic definitions.
Let $F$ be a set (possibly empty) and $\Omega$ a function from $F$ to the set $\omega$ of natural
numbers. Throughout this paper, $F$ will always be assumed to be finite. The
pair $\langle F, \Omega\rangle$ (often denoted just by $\Omega$) is called a signature. A (partial) $\Omega$-algebra
is a pair $\mathfrak{A} = \langle A, I\rangle$, where $A$ is a non-empty set (called the carrier set of $\mathfrak{A}$)
and $I$ is a function with domain $F$, which assigns to each $f \in F$ a (partial)
$\Omega(f)$-ary function on $A$. A homomorphism from $\mathfrak{A}$ to some (partial) $\Omega$-algebra
$\mathfrak{B} = \langle B, J\rangle$ is a function $h : A \to B$ such that for all $f \in F$ and all $\vec{a} \in A^{\Omega(f)}$

$$h(I(f)(\vec{a})) = J(f)(h(\vec{a})).$$

(Here, $h(\vec{a}) = \langle h(a_0), \ldots, h(a_{\Omega(f)-1})\rangle$.) This means that the left hand side is
defined iff the right hand side is defined, and if both sides are defined they are equal.
If $B \subseteq A$ and $h$ is the identity map we speak of $\mathfrak{B}$ as a subalgebra of $\mathfrak{A}$. In that
case one can show that $J(f) = I(f) \restriction B^{\Omega(f)}$. Let $D \subseteq A$ be an arbitrary subset
of $A$. We denote by $[D]$ the least subset of $A$ that contains $D$ and is closed under
all partial functions $I(f)$. If $D$ has cardinality $n$ and $[D] = A$, we say that $\mathfrak{A}$ is
n-generated. In particular, $\mathfrak{A}$ is 0-generated if $A = [\varnothing]$.
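The generated subset can be computed by iterating the partial functions until nothing new appears, provided the result is finite. The following is a minimal sketch under that assumption; partial functions are modelled as Python callables that return `None` where they are undefined, and the name `generated` is hypothetical.

```python
from itertools import product


def generated(seed, functions):
    """Least superset of `seed` closed under the given partial functions.

    functions is a list of (arity, fn) pairs; fn returns None where undefined.
    Terminates only if the generated set is finite.
    """
    closed = set(seed)
    while True:
        new = set()
        for arity, fn in functions:
            # For arity 0, product yields a single empty tuple, so constants are added.
            for args in product(closed, repeat=arity):
                value = fn(*args)
                if value is not None and value not in closed:
                    new.add(value)
        if not new:
            return closed
        closed |= new


# A constant 0 and a partial successor-by-three that is undefined from 9 on.
modes = [(0, lambda: 0), (1, lambda x: x + 3 if x < 9 else None)]
from_empty = generated(set(), modes)
```

Starting from the empty seed, only the nullary function contributes in the first round, which mirrors the case of a 0-generated algebra.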

Fix a set $V := \{x_i : i \in \omega\}$. An $\Omega$-term is defined inductively as follows.
Every $x_i \in V$ is a term; if $f \in F$ and $t_i$, $i < \Omega(f)$, are terms, so is
$f(t_0, t_1, \ldots, t_{\Omega(f)-1})$. Now, given an algebra $\langle A, I\rangle$, the set of polynomials is the
set of $\Omega_A$-terms, where $\Omega_A$ is obtained from $\Omega$ by adding a nullary function
symbol $a$ for each $a \in A$, which is interpreted by $a$ in $A$.

We assume that a language is a set of signs. In accordance with the syntactic 
literature, a sign is a triple, consisting of an exponent (this is what you can 
actually see of the sign) , a type, and a meaning. 

Definition 1 Let $E$, $T$ and $M$ be (nonempty) sets. A sign over $(E,T,M)$ is
a member $\sigma$ of the Cartesian product $E \times T \times M$. If $\sigma = \langle e,t,m\rangle$, we call $e$
the exponent of $\sigma$, $t$ the type of $\sigma$ and $m$ the meaning of $\sigma$. A language over
$(E,T,M)$ is a set of $(E,T,M)$-signs.

We denote the first projection from a sign by $\varepsilon$, the second projection by $\tau$ and
the third projection by $\mu$. Given a language $L$, then $\varepsilon[L] := \{\varepsilon(\sigma) : \sigma \in L\}$ is
the set of exponents of $L$. When there is no risk of confusion we shall also speak
of $\varepsilon[L]$ as a language. This covers the traditional usage of a subset of $A^*$ being
a language. Here are some examples of signs.

‘a’ : $\langle \mathtt{a},\ np/n,\ \lambda P.\lambda Q.(\exists x)(P(x) \wedge Q(x))\rangle$

‘man’ : $\langle \mathtt{man},\ n,\ \lambda x.\mathrm{man}'(x)\rangle$

‘walks’ : $\langle \mathtt{walks},\ np\backslash t,\ \lambda x.\mathrm{walk}'(x)\rangle$

Notice that the name of the sign (eg ‘man’) can be anything, even a number. 
What we can actually see of the sign is its exponent, ie man. (We write in type- 
writer font the exponent of a sign. The symbols in typewriter font are therefore 
true letters in print, while any other symbol is only proxy for some letter or 
string. Typewriter fonts are our device for quoting a string in running text with- 
out having to use quotation marks. Notice that ‘man’ is used to refer to the 
complete sign rather than its exponent.) 
Definition 2 A $(E,T,M)$-grammar $G$ consists of a finite signature $\Omega$ and
functions $I_E$, $I_T$ and $I_M$ such that $\mathfrak{E} := \langle E, I_E\rangle$, $\mathfrak{T} := \langle T, I_T\rangle$ and $\mathfrak{M} := \langle M, I_M\rangle$
are partial $\Omega$-algebras. The language of $G$, $L(G)$, is defined by $L(G) := [\varnothing]$. We
call $F$ the set of modes of $G$. A mode is proper if it has nonzero arity. The
lexicon of $G$ is the set $\{\langle I_E(f), I_T(f), I_M(f)\rangle : \Omega(f) = 0\}$.

This definition needs some exegesis. We think of E, T and M as independent sets, 
given in advance. At present, there are no conditions on these sets. A language is 
any subset of $E \times T \times M$. A grammar is a system by which the signs of L are built
from some finite set of signs (the lexicon) by means of some finite set of functions 
(called proper modes). Therefore, modes correspond to grammatical rules. For 
example, if / is a binary mode, one thinks of / as the following grammatical rule 



$$f(\sigma_1, \sigma_2) \to \sigma_1\ \sigma_2$$



Notice that terminal symbols of a grammar are generally not considered to be 
rules (they are part of the lexicon), but there is no harm in thinking of them as 
rules of the form $\sigma \to\ .$ (think of the arrow as the :- in Prolog). Notice that
our definition of grammar is modular: not any set of functions from signs to 
signs qualifies. Rather, we assume that the modes operate independently on the 
exponents, the types and the meanings of the signs. Therefore, in order to define 
a grammar, one needs only to specify the interpretation of the modes in each of 
the three sets E, T and M independently. This defines the algebras $\mathfrak{E}$, $\mathfrak{T}$ and
$\mathfrak{M}$. The rest is completely determined. One forms the product $\mathfrak{E} \times \mathfrak{T} \times \mathfrak{M}$. This
algebra contains a unique minimal subalgebra. Its carrier set is precisely the set
of all signs that can be produced from the lexicon by the set of proper modes.
If L has a grammar, it is also a partial $\Omega$-algebra $\mathfrak{L}$ in some canonical way. We
should note that if L is a $(E, T, M)$-language, it is also a $(E', T', M')$-language
for any $E' \supseteq E$, $T' \supseteq T$ and $M' \supseteq M$. However, there is a different sense of
extension of a language that will play a role in the discussion. 

Definition 3 Let $L$ be a $(E,T,M)$-language and $L'$ a $(E',T',M')$-language. $L'$
extends $L$ if $L' \cap (E \times T \times M) = L$.

If $L'$ properly extends $L$, one may think of $L'$ as having hidden signs. Such a
sign is of the form $\langle e, t, m\rangle$ where either $e \notin E$ or $t \notin T$ or $m \notin M$. It may be
difficult to argue for a sign with exponent $e \in E$ to be hidden. Anyhow, we shall
argue below that one should not postulate hidden signs. But for the moment, 
we leave this possibility open. The following definition embodies the notion of 
compositionality as employed in the literature. 

Definition 4 Let $L$ be a $(E,T,M)$-language. $L$ is naturally compositional if
there is a $(E,T,M)$-grammar $G$ such that $L(G) = L$. $L$ is compositional if
there is a $L'$ which extends $L$ and is naturally compositional.

Notice that the notion of extension allows the introduction of new exponents
or new types or new meanings if necessary. However, no new $(E,T,M)$-signs
may be introduced. To see why this is the standard approach, notice that the
problem is stated as follows. A language is defined as a subset $L$ of $E \times M$. The pair
$\langle e, m\rangle$ corresponds to a sentence $e$ with meaning $m$. With this datum, a grammar
is written, introducing new types (for intermediate constituents) and assigning
arbitrary meanings to signs containing them. However, in natural languages the 
natural parts of a sentence — the constituents — do have their own meaning, 
and so we cannot choose that meaning arbitrarily. Moreover, the meaning of the 
constituent contributes to the meaning of the sentence. Furthermore, a context- 
free grammar is often defined as a string generating device of the following 
form. The symbol S is the start symbol. Starting with S, one may replace step 
by step each nonterminal by the right side of an appropriate rule. In our view, 
however, this approach suffers from a confusion between the type of a sign and its 
exponent. The symbol NP is not part of the string-algebra, which contains only 
sequences of terminal strings. Hence, NP is a type. In the course of a derivation 
we actually derive triples $\langle x, \mathrm{NP}, m\rangle$, where $x$ is a string of type NP and meaning
m. 

Nevertheless, if the set of types is finite, $\mathfrak{L}$ may in fact be thought of as a
many sorted algebra over sound-meaning pairs. However, the introduction of the
types has the advantage of eliminating the need to type the signs, and it allows one to
incorporate grammars with explicit type constructors, such as categorial grammars.
If $\bullet$ is a mode, we write $\bullet^\varepsilon$ in place of $I_E(\bullet)$, $\bullet^\tau$ in place of $I_T(\bullet)$ and $\bullet^\mu$ in
place of $I_M(\bullet)$. By our definitions, $\bullet$ is defined on a tuple of signs exactly when all
three functions $\bullet^\varepsilon$, $\bullet^\tau$ and $\bullet^\mu$ are defined on the corresponding projections. This
means that the partiality is introduced by the algebras of exponents, types and
meanings. One fundamental assumption is made.

Feasibility. If $\bullet$ is a mode, then the corresponding functions $\bullet^\varepsilon$, $\bullet^\tau$ and $\bullet^\mu$
are computable.

Moreover, in Montague Grammar the functions are actually polynomials in the
basic functions of the algebras of strings, types and meanings, respectively. We
shall assume the same here. This means that they can be expressed in $\lambda$-notation,
but we shall suppress $\lambda$-notation whenever possible.

Polynomial Feasibility. If $\bullet$ is a mode, then the corresponding functions
$\bullet^\varepsilon$, $\bullet^\tau$ and $\bullet^\mu$ are polynomial functions. Moreover, each of the functions
is computable in polynomial time.

Notice that it is not clear that a polynomial function is computable in polynomial
time. If, say, $\mathfrak{M}$ consists in the domain $\omega$ of natural numbers and a unary function
$f : \omega \to \omega$ which is not computable, then $f$ is a polynomial function but is
not polynomially computable. The results rely only partly on this restriction of
polynomial time computability.

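In the simplest case, the grammar has only one mode, $\bullet$, called the merge. It
is given by the functions $\bullet^\varepsilon$, $\bullet^\tau$ and $\bullet^\mu$, which are defined by

$$
\begin{aligned}
\varepsilon(\sigma_1 \bullet \sigma_2) &= \varepsilon(\sigma_1) \bullet^\varepsilon \varepsilon(\sigma_2)\\
\tau(\sigma_1 \bullet \sigma_2) &= \tau(\sigma_1) \bullet^\tau \tau(\sigma_2)\\
\mu(\sigma_1 \bullet \sigma_2) &= \mu(\sigma_1) \bullet^\mu \mu(\sigma_2)
\end{aligned}
$$

Let $\bullet^\varepsilon$ be concatenation of strings with a blank ($\Box$) inserted, $\bullet^\tau$ be
slash-cancellation and $\bullet^\mu$ be function application, and we get Montague Grammar.
To give an example, we may compose the signs ‘a’ and ‘man’. This gives the sign
$\sigma$ := ‘a’ $\bullet$ ‘man’. Now,

$$
\begin{aligned}
\varepsilon(\sigma) &= \varepsilon(\text{‘a’}) \bullet^\varepsilon \varepsilon(\text{‘man’})\\
&= \mathtt{a}^\frown\Box^\frown\mathtt{man}\\
&= \mathtt{a\ man}
\end{aligned}
$$

Likewise, $\tau(\sigma) = np$ and $\mu(\sigma) = \lambda Q.(\exists x)(\mathrm{man}'(x) \wedge Q(x))$ are established. Hence
we get

$$\sigma = \langle \mathtt{a\ man},\ np,\ \lambda Q.(\exists x)(\mathrm{man}'(x) \wedge Q(x))\rangle$$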
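The composition of ‘a’ and ‘man’ can be replayed in a few lines. This is a sketch under loudly stated assumptions: meanings are modelled as Python functions over a small invented universe (`DOMAIN` and `MEN` are hypothetical), types are plain strings, and only forward slash-cancellation (`a/b` followed by `b`) is implemented, since that is all the example needs. The names `Sign` and `merge` are not from the text.

```python
from typing import NamedTuple, Optional

DOMAIN = {"socrates", "fido"}  # hypothetical universe of individuals
MEN = {"socrates"}             # hypothetical extension of man'


class Sign(NamedTuple):
    exponent: str
    type: str
    meaning: object


def merge(s1: Sign, s2: Sign) -> Optional[Sign]:
    """Merge two signs; returns None where the mode is undefined."""
    # Type component: forward slash-cancellation, a/b followed by b gives a.
    if not s1.type.endswith("/" + s2.type):
        return None
    new_type = s1.type[: -len(s2.type) - 1]
    # Exponent component: concatenation with a blank inserted.
    new_exponent = s1.exponent + " " + s2.exponent
    # Meaning component: function application.
    return Sign(new_exponent, new_type, s1.meaning(s2.meaning))


A = Sign("a", "np/n", lambda P: lambda Q: any(P(x) and Q(x) for x in DOMAIN))
MAN = Sign("man", "n", lambda x: x in MEN)

a_man = merge(A, MAN)
```

The three components are computed independently of each other, and the merge is undefined as soon as one of them is, here the type component.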

A word on notation. We shall assume that the exponents of signs are strings 
or sequences thereof. These strings are represented in exactly the same way as 
they are written. So, each word is actually already a sequence (of letters) and 
concatenation of words is done by putting a blank in between the two. We use 
$\Box$ to denote the blank. Plain concatenation is denoted by $^\frown$ or, if no confusion
arises, it is denoted simply by juxtaposition. Word concatenation is denoted by
$\cdot$. We have $x \cdot y = x^\frown\Box^\frown y$.

Suppose that G generates L. Since $L = [\varnothing]$, any sign of the language can be
represented by a constant term. This term is called its representing term. Terms 
uniquely designate derivations. So, if for a given exponent x there are m terms 
representing a sign with exponent x then x is said to be m-fold structurally 
ambiguous. The reader should bear in mind that the term denotes the signs only 
with the grammar being given. 

The definitions allow that the exponents of signs may be anything we please. 
Yet, if by exponent we mean the visible (or audible) exponent of the sign, there 
is little sense in assuming that the exponents of signs are anything but strings. 
This is, with a minor adaptation, what we shall assume here. It has been shown 
that given any recursively enumerable language and any computable function 
assigning meanings to strings, there is a type assignment and a compositional 
grammar generating this language (see [?] for details). However, there is a sense
in which some of these grammars may fail to be compositional in an intuitive 
sense. We shall therefore say that a grammar G is a strict grammar of L if it 
satisfies the following two requirements. (Two other principles will still follow, 
but we assume them to follow from the next two, if not in the literal sense, then 
at least in spirit.) 

Naturalness. L{G) = L. 

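Analyticity. There is a finite set $A$ such that $E \subseteq A^*$. For each mode $f$
and each sequence $\vec{\sigma}$ on which $f$ is defined there is a string polynomial
$p(\vec{x})$, in which each $x_i$, $i < \Omega(f)$, occurs at least once, such that

$$f^\varepsilon(\varepsilon(\vec{\sigma})) = p(\varepsilon(\vec{\sigma}))$$

(Here, $\varepsilon(\vec{\sigma}) = \langle \varepsilon(\sigma_i) : i < \Omega(f)\rangle$. Notice that $f^\varepsilon(\varepsilon(\vec{\sigma})) = \varepsilon(f(\vec{\sigma}))$, by the
assumption that $f$ is defined on $\vec{\sigma}$.)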
Definition 5 A language L is strictly compositional if it has a strict compositional
grammar.

These principles need a certain amount of motivation. The principle of Naturalness
is stronger than requiring that L(G) extends L, which is commonly used
in the definitions of compositionality. If G is natural for L, L is actually naturally
compositional, as defined earlier. So, notice that we require that all signs
that the grammar generates are part of the language. In particular, the signs 
generated have exponents that are units of the language, and the grammar as- 
signs those types and meanings to them that they have in that language. What 
signs there are is of course an empirical question. It is to a large extent also a 
question of methodology or personal persuasion. Not everyone will accept that 
$\lambda$-terms count as meanings for the words of natural language. However, even
if the existence and nature of signs is unclear the question of what they are is 
not without meaning at all. Just like Augustinus observed with respect to time, 
there is a perennial problem with respect to meaning. It seems that we know 
perfectly well what meaning is, but when asked to give a definition, we fail. The 
present discussion is therefore not targeted at the question of what signs there 
are; rather, granted that we know that, what can a grammar for them look like?
To emphasize our point once more: if the question whether languages are com- 
positional is to have any nontrivial meaning at all, it is because we exclude the 
introduction of new signs. 

The Principle of Analyticity means the following. If a sign $\sigma$ is directly derived
from $\sigma_i$, $i < n$, then the exponents of the $\sigma_i$ are disjoint parts of the
exponent of $\sigma$. We have phrased this using the language of string polynomials.
It should be stressed that the polynomial may depend on the mode as well as 
the sequence of signs. However, if it is independent of the choice of the signs, we 
call the grammar uniformly analytic. 

Definition 6 A grammar is uniformly analytic if for every mode $f$ there exists
a string polynomial $p$, in which each variable $x_i$, $i < \Omega(f)$, occurs at least once,
such that for all sequences $\vec{\sigma}$ of signs on which $f$ is defined the equation

$$f^\varepsilon(\varepsilon(\vec{\sigma})) = p(\varepsilon(\vec{\sigma}))$$

holds.

However, the interpretation of the mode in $A^*$, $f^\varepsilon$, need not be identical to $p$,
it may just be a subset of $p$ (since $f^\varepsilon$ may be a strictly partial function). In
fact, Uniform Analyticity comes down to the existence of a polynomial $p$ of the
appropriate kind such that $f^\varepsilon = p \restriction \mathrm{dom}(f^\varepsilon)$.

Example 1. To illustrate our point we shall discuss the example of Kazmi
and Pelletier in [7]. Let $A = \{\mathtt{a}, \mathtt{b}, \mathtt{c}\}$ be the alphabet and let the set of
exponents be $\{\mathtt{a}, \mathtt{b}, \mathtt{c}, \mathtt{ca}, \mathtt{cb}\}$. Now we introduce a meaning assignment, which has
the property that $\mu(\mathtt{a}) = \mu(\mathtt{b})$ but $\mu(\mathtt{ca}) \neq \mu(\mathtt{cb})$. Now, suppose first that there
is only one type of expression. Then it is easily seen that there exists no strictly
compositional grammar for that language. This is independent of whether we
assume Polynomial Feasibility. For there simply is no function $\bullet^\mu$ whatsoever
that satisfies

$$
\begin{aligned}
\mu(\mathtt{ca}) &= \mu(\mathtt{c}) \bullet^\mu \mu(\mathtt{a})\\
\mu(\mathtt{cb}) &= \mu(\mathtt{c}) \bullet^\mu \mu(\mathtt{b})
\end{aligned}
$$
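The argument can be confirmed exhaustively for a small case. The sketch below assumes a two-element meaning domain and one particular meaning assignment with the required property; both are invented for illustration, as is the name `satisfying_functions`. It enumerates all sixteen binary functions on the domain and collects those that satisfy both equations.

```python
from itertools import product

M = (0, 1)  # hypothetical meaning domain
# Hypothetical assignment with mu(a) = mu(b) but mu(ca) != mu(cb).
mu = {"a": 0, "b": 0, "c": 1, "ca": 0, "cb": 1}


def satisfying_functions():
    """All binary functions on M that satisfy both equations."""
    pairs = list(product(M, repeat=2))
    found = []
    for values in product(M, repeat=len(pairs)):
        table = dict(zip(pairs, values))
        ok_ca = table[(mu["c"], mu["a"])] == mu["ca"]
        ok_cb = table[(mu["c"], mu["b"])] == mu["cb"]
        if ok_ca and ok_cb:
            found.append(table)
    return found


candidates = satisfying_functions()
```

No candidate survives, because both equations query the function at the same argument pair while demanding different values.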

So, restricting the class of functions to exclude this example — as Zadrozny pro- 
poses — is not necessary at all under the assumption of strict compositionality. 
We shall briefly outline two alternatives to the present example to shed light on 
the kind of restrictions that are operative here. First, we could have said that a, 
b, c, ca and cb are the exponents of primitive signs and that there is no proper 
mode, just five 0-ary modes. This is unintended in the example above, since we 
write ca to imply that the sign with exponent ca is composed from the signs 
with exponents c and a, respectively. Notice however that many languages have 
words that can be segmented as strings into ‘unnatural’ parts, in which case the 
impossibility to have a compositional account of the meaning of that word does 
simply not arise since the word as a sign is not conceived of as consisting of these 
parts. To give an example, the word ‘selfish’ is not the result of composing ‘sell’
and ‘fish’, although that (almost) sounds the same. Also the word ‘caterpillar’
is not a compound built from ‘(to) cater’ and ‘pillar’. Likewise, idioms must be 
considered as units from a semantic point of view. Evidently, they can often also 
be read literally (in which case they are decomposed), but this is not at issue 
here. Thus, the notion of a basic sign (ie a 0-ary mode) is different from the 
notion of a sign having a nonsegmentable exponent. A second variation on that 
theme is to allow the same string to have several types. We could, for example, 
allow any string to denote itself when of type, say, s. In that case we have two 
signs with the same exponent, for example 

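$$\langle \mathtt{a}, t, \mu(\mathtt{a})\rangle,\quad \langle \mathtt{a}, s, \mathtt{a}\rangle$$

In this situation, a compositional grammar in the strict sense once again exists.
Just assume two binary modes $\circ$ and $\bullet$:

$$
\begin{aligned}
\langle x, s, m\rangle \circ \langle y, s, n\rangle &= \langle xy, s, m^\frown n\rangle\\
\langle x, s, m\rangle \bullet \langle y, s, n\rangle &= \langle xy, t, \mu(m^\frown n)\rangle
\end{aligned}
$$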

So, we can either compose two strings qua strings, in which case we do concate- 
nation on the semantic side. Or we can compose strings qua meaningful entities 
— in which case we shall step from the string to the corresponding meaning. 
(Actually, • alone would have sufficed here. One can also introduce a unary mode 
that transports a string to its meaning.) In this way, once again a compositional 
grammar exists for any language whatsoever. Or so it seems. What should be 
observed, however, is that we are not giving a grammar for the same language, 
since a language is a set of signs, and we have expanded the set of signs to include
strings of type s. But how about human languages? We claim that in human 
languages it is not possible to have a string denote its usual meaning in addition 
to denoting itself. In the latter case we call the string quoted. There exist devices 
that allow one to quote a string qua string, so actually we do have strings as
meanings in our language. (Just anything can be the meaning of an expression.) But
the string $x$ can never denote itself (ie $x$). Rather it must occur in a context that
quotes it. For example, we distinguish in writing between man, the exponent
of a sign whose meaning is, say, $\lambda x.\mathrm{man}'(x)$, and ‘man’ (using object language
quotation marks here), the exponent of a sign whose meaning is man. So, it is
never $x$ itself that denotes $x$ but, for example, the string $x$ enclosed in quotes.
Of course, it is only natural for a semiotic system not to allow for the possibility 
that its signs are all self-denoting, which would mean that we probably didn’t 
need the signs in the first place. And precisely this saves us from vacuity. 

The Principle of Analyticity is needed to be able to talk meaningfully about 
parts of a sign. Hence, we can talk about constituents and of a constituent anal- 
ysis. As the principle stands, it is not unproblematic. First, we shall have to 
talk of occurrences of strings. The Principle of Analyticity is to be understood 
to require that $\varepsilon(\sigma)$ is made from the $\varepsilon(\sigma_i)$, and that each $\varepsilon(\sigma_i)$ occurs at least
once. So, there is no deletion of any material whatsoever. Assuming for the moment
that the exponents are strings, the function $\bullet^\varepsilon$ shall simply be a semigroup
polynomial in $n$ variables such that each variable occurs at least once. Examples
are 

p{x, y) := xxy, q{x, y) := yxya 

Here, a is a certain constant (the letter a). The Principle of Analyticity does 
allow for empty categories. However, we shall apply Occam’s Razor and assume 

Nonemptyness. No sign has empty exponent. 
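String polynomials of this kind are easy to experiment with. In the sketch below a polynomial is written as a list whose items are either variable indices (integers) or constant strings; this representation and the names `evaluate` and `uses_all_variables` are assumptions made for the illustration.

```python
def evaluate(poly, args):
    """Substitute the strings in `args` for the variables of `poly`."""
    return "".join(args[item] if isinstance(item, int) else item for item in poly)


def uses_all_variables(poly, arity):
    """True if every variable x_i, i < arity, occurs at least once."""
    return all(i in poly for i in range(arity))


P = [0, 0, 1]        # p(x, y) := xxy
Q = [1, 0, 1, "a"]   # q(x, y) := yxya

p_value = evaluate(P, ["ab", "c"])
q_value = evaluate(Q, ["ab", "c"])
```

Because every variable occurs at least once and the arguments are nonempty, each argument string reappears as a part of the result, which is the sense in which nothing is deleted.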

Now follow some more examples of grammars. 

Example 2. Numbers are written as sequences of digits, where a digit is
a member of $\{0, 1, \ldots, 9\}$. The sequence $x_{k-1}x_{k-2} \cdots x_0$ represents the
number $\sum_{i<k} \mu(x_i) \cdot 10^i$, where $\mu(x_i)$ is the number assigned to each digit. There
are therefore two types: digits ($D$) and sequences ($S$). The following is a strict
grammar. These are the nullary modes:

‘0’ : $\langle \mathtt{0}, D, 0\rangle$
‘1’ : $\langle \mathtt{1}, D, 1\rangle$
$\vdots$
‘9’ : $\langle \mathtt{9}, D, 9\rangle$

There is a unary mode $\star$ and a binary mode $\circ$. $\star^\varepsilon(x) := x$, $\star^\mu(n) := n$, and
$\star^\tau(D) := S$. $\star^\tau(S)$ is undefined. For the binary mode $\circ$ we have $x \circ^\varepsilon y := xy$,
$S \circ^\tau D := S$. $\circ^\tau$ is undefined otherwise. $m \circ^\mu n := 10m + n$. So, the term $\star$‘7’
corresponds to the sign $\langle \mathtt{7}, S, 7\rangle$, the term ($\star$‘2’ $\circ$ ‘3’) $\circ$ ‘7’ to the sign $\langle \mathtt{237}, S, 237\rangle$.
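The modes of this grammar translate almost literally into code. The following is a minimal sketch, with partiality modelled by returning `None`; the names `Sign`, `digit`, `star` and `circ` are hypothetical.

```python
from typing import NamedTuple, Optional


class Sign(NamedTuple):
    exponent: str
    type: str
    meaning: int


def digit(d: str) -> Sign:
    """Nullary mode for a single digit."""
    assert len(d) == 1 and d in "0123456789"
    return Sign(d, "D", int(d))


def star(s: Optional[Sign]) -> Optional[Sign]:
    """Unary mode: a digit becomes a one-digit sequence."""
    if s is None or s.type != "D":
        return None
    return Sign(s.exponent, "S", s.meaning)


def circ(s: Optional[Sign], d: Optional[Sign]) -> Optional[Sign]:
    """Binary mode: append a digit to a sequence."""
    if s is None or d is None or s.type != "S" or d.type != "D":
        return None
    return Sign(s.exponent + d.exponent, "S", 10 * s.meaning + d.meaning)


seven = star(digit("7"))
two_three_seven = circ(circ(star(digit("2")), digit("3")), digit("7"))
```

The type component alone decides where the binary mode is defined: a sequence on the left and a digit on the right.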

Since the algebra of types is finite, one can actually present the algebra of
signs in form of the following context-free grammar.

$$
\begin{aligned}
\langle \mathtt{0}, D, 0\rangle \mid \ldots \mid \langle \mathtt{9}, D, 9\rangle &\to\ .\\
\langle xy, S, 10m + n\rangle &\to \langle x, S, m\rangle\ \langle y, D, n\rangle\\
\langle x, S, n\rangle &\to \langle x, D, n\rangle
\end{aligned}
$$

(Here, | denotes an alternative; the first line abbreviates therefore a total of ten
rules.) Terminal symbols are denoted here by nullary rules. This grammar is
left-regular.
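Example 3. As above. However, assign to a sequence $x$ as meaning the pair
$\nu(x) := \langle \ell(x), \mu(x)\rangle$.

$$
\begin{aligned}
\langle \mathtt{0}, D, \langle 1, 0\rangle\rangle \mid \ldots \mid \langle \mathtt{9}, D, \langle 1, 9\rangle\rangle &\to\ .\\
\langle yx, T, \langle \ell_1 + \ell_2, 10^{\ell_1} \cdot m_2 + m_1\rangle\rangle &\to \langle x, T, \langle \ell_1, m_1\rangle\rangle\ \langle y, D, \langle \ell_2, m_2\rangle\rangle\\
\langle x, T, \langle \ell, n\rangle\rangle &\to \langle x, D, \langle \ell, n\rangle\rangle\\
\langle x, S, n\rangle &\to \langle x, T, \langle \ell, n\rangle\rangle
\end{aligned}
$$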

This is a compositional grammar. However, it is not strict. The problem is that
the meaning of a string x simply is the number it represents; we are not allowed
to add any more to it. It can be shown that there is no right-regular
strict grammar that generates the number sequences. For a proof note that the 
sequences 7, 07, 007 etc all have the same meaning, namely 7. However, the 
result of prefixing 1 is different in all cases. We get 17, 107, 1007 etc, which all 
represent different numbers. However, since we have only finitely many types in 
a right-regular grammar, some of the sequences 7, 07, 007 etc must have equal 
type, and therefore the grammar generates two different sequences starting with 
1, which are assigned the same number. This is, however, incorrect. 
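
The counting argument can be checked on a small sample. The following sketch (the variable names are illustrative, and Python's built-in `int` stands in for the meaning function) confirms that the sequences share one meaning while their prefixed versions are pairwise distinct, which is what rules out assigning two of them the same type in a strict right-regular grammar.

```python
# Sequences with leading zeros: same meaning, different behaviour under prefixing.
sequences = ["7", "07", "007", "0007"]

# The number each sequence represents.
meanings = [int(s) for s in sequences]

# The number represented after prefixing the digit 1.
prefixed = [int("1" + s) for s in sequences]

same_meaning = len(set(meanings)) == 1
all_distinct_after_prefix = len(set(prefixed)) == len(sequences)
```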

It is tempting to conclude that a strict grammar for the number sequences 
must be left regular. This is more or less correct, but there are infinitely many 
grammars that can generate the number sequences compositionally even in the 
strict sense and each is different in the constituent analysis that it introduces. 
Nevertheless, the constituent analysis cannot be right regular. Strict grammars 
must satisfy the condition that not both a string x and Ox are generated having 
identical type. So, even if the constituent analysis is not uniquely defined by the 
principles laid out above, there nevertheless are certain things that can be said 
about possible constituents. 

The present definitions do not say anything about the size of the algebra of 
types. It may be finite or infinite. However, notice that if a grammar has in- 
finitely many signs then the same exponent can in principle have infinitely many 
meanings depending on which type it has. Moreover, strings can be infinitely
ambiguous, simply because they can be derived from the same string using
some unary modes. Since we do not want to exclude the possibility that the number
of types is infinite, we shall rather require that unary modes must also change the
exponent. Although this does not follow from Analyticity in a literal sense, it is 
nevertheless a principle of the same sort, since it requires that every step leaves 
a visible trace on the exponent. 

Productivity. If $\sigma$ is composed from $\tau$ by applying a unary mode, then
$\varepsilon(\tau)$ is shorter than $\varepsilon(\sigma)$.

This means that there can be no unary rules that only change the category 
(type) and the meaning; rather, a unary mode must introduce material into the 
string. The grammar in Example 2 does not comply with this restriction. There 
is an easy fix for that. Just assume the following additional 0-ary modes: 



‘0#’ : $\langle 0, S, 0\rangle$
‘1#’ : $\langle 1, S, 1\rangle$
$\vdots$
‘9#’ : $\langle 9, S, 9\rangle$
Now eliminate the last rule and add instead the rule

$\langle xy, S, 10m + n\rangle \to \langle x, D, m\rangle\ \langle y, D, n\rangle$

This looks like a bad trick but is in fact quite logical. The digit 3 for example, is 
actually the exponent of two signs, that of the digit 3 and that of the sequence 
consisting of 3. This grammar reflects the dual nature of sequences of length 1. 
Sequences of length > 1 can of course not be digits. 

There is plenty of evidence that in language there are empty signs and also 
non productive modes. However, their use must obviously be highly restricted 
otherwise the determination of the meaning from the sound can become infea- 
sible. So, when one looks closely at the matter it often enough appears that the 
use of empty signs and non productive modes can be eliminated in much the 
same way as it can be done in context free grammars. 

Example 4. This example concerns the English number names, in a slightly
simplified form. This example goes back to Arnold Zwicky; for a discussion of the
formal complexity of English and Chinese number names see [?]. Each language
has a largest primitive name for a number. This number varies from language
to language. Let us assume that it is million for English. Numbers of the form
$10^{6k}$ are represented by the $k$-fold iteration of the word million. The number
2000003000005 is therefore represented by the string

two million million three million five 

We shall leave out the words thousand, hundred as well as ten, twenty etc and
assume that our alphabet consists only of 

{zero, one, two, . . . , nine, million} 

Legitimate expressions have the following form: 

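$x_0 \cdot \mathit{million}^{k_0} \cdot x_1 \cdot \mathit{million}^{k_1} \cdot \ldots \cdot x_{m-1} \cdot \mathit{million}^{k_{m-1}}$

where $k_0 > k_1 > \ldots > k_{m-1}$ and $x_j \neq \mathit{million}$ for all $j < m$. This sequence
represents the number

$\sum_{i=0}^{m-1} \mu(x_i) \cdot 10^{6k_i}$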

This language is not generable by a Linear Context-Free Rewrite System
(LCFRS); in fact, it is not even semilinear. However, it is
recognizable in polynomial time. We shall propose two quite similar grammars.
Both have the following 0-ary modes:



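‘0’ : $\langle \mathit{zero}, D, 0\rangle$
‘0#’ : $\langle \mathit{zero}, S, 0\rangle$
‘1’ : $\langle \mathit{one}, D, 1\rangle$
‘1#’ : $\langle \mathit{one}, S, 1\rangle$
$\vdots$
‘9’ : $\langle \mathit{nine}, D, 9\rangle$
‘9#’ : $\langle \mathit{nine}, S, 9\rangle$
‘m’ : $\langle \mathit{million}, M, 10^6\rangle$
‘0†’ : $\langle \mathit{zero}, SE, 0\rangle$
‘1†’ : $\langle \mathit{one}, SE, 1\rangle$
$\vdots$
‘9†’ : $\langle \mathit{nine}, SE, 9\rangle$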
Here, $D$ is the type of a digit, $S$ that of a number expression, and $SE$ that of
a simple expression, where a simple expression is of the form $x \cdot \mathit{million}^{k}$, $x$ a
digit. Now, there are binary modes which operate as follows:

$\langle x, SE, m\rangle \bullet \langle y, M, n\rangle := \langle x \cdot y, SE, m \cdot n\rangle$

$\langle x, SE, m\rangle\ \langle y, M, n\rangle := \langle x \cdot y, S, m \cdot n\rangle$

Finally, the mode $\circ$ is defined as

$\langle x, S, m\rangle \circ \langle y, SE, n\rangle := \langle x \cdot y, S, m + n\rangle$
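
As a rough sketch of how these modes compose, the following Python fragment rebuilds the sample string from above. Everything named here is an assumption made for illustration: signs are tuples (exponent, type, meaning), word concatenation is rendered with a single space, `attach_million` covers both modes that combine a simple expression with million (the result type is passed as a parameter), and the optional `ordered` flag adds a string-based check on trailing blocks of million of the kind needed to enforce decreasing order.

```python
from typing import Optional, Tuple

Sign = Tuple[str, str, int]  # (exponent, type, meaning)

MILLION: Sign = ("million", "M", 10**6)


def simple(word: str, value: int) -> Sign:
    """A digit word as a simple expression, e.g. <two, SE, 2>."""
    return (word, "SE", value)


def attach_million(s: Sign, m: Sign, result_type: str = "SE") -> Optional[Sign]:
    """<x, SE, m> combined with <y, M, n> gives <x.y, result_type, m*n>."""
    (x, t1, a), (y, t2, b) = s, m
    if (t1, t2) != ("SE", "M"):
        return None
    return (f"{x} {y}", result_type, a * b)


def trailing_millions(x: str) -> int:
    """Length of the block of 'million' words at the end of x."""
    words = x.split()
    n = 0
    while n < len(words) and words[-1 - n] == "million":
        n += 1
    return n


def circ(s: Sign, se: Sign, ordered: bool = False) -> Optional[Sign]:
    """<x, S, m> o <y, SE, n> := <x.y, S, m+n>, optionally order-checked."""
    (x, t1, a), (y, t2, b) = s, se
    if (t1, t2) != ("S", "SE"):
        return None
    if ordered and trailing_millions(x) <= trailing_millions(y):
        return None  # x must end in a larger block of million than y
    return (f"{x} {y}", "S", a + b)


two_mm = attach_million(attach_million(simple("two", 2), MILLION), MILLION, "S")
three_m = attach_million(simple("three", 3), MILLION)
result = circ(circ(two_mm, three_m, ordered=True), simple("five", 5), ordered=True)
```

Without the `ordered` flag the sketch also accepts sequences such as five three million, which is the overgeneration that the restriction is meant to remove.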

However, as the definitions stand the grammar generates all sequences of simple 
expressions, not necessarily in decreasing order. To implement the latter restriction,
we have two choices: (a) we define $m \circ^{\mu} n := m + n$ if $m > n$,
and leave $m \circ^{\mu} n$ undefined otherwise; (b) we define $x \circ^{\varepsilon} y := x \cdot y$ if $x$ ends in a larger
block of million than $y$, and leave $x \circ^{\varepsilon} y$ undefined otherwise. The proposals differ
slightly. If we disallow expressions of the form zero $\mathit{million}^{k}$ then they are actually
equivalent, otherwise option (a) gives incorrect results. The trouble with
both proposals is that we need to introduce strange partial functions. However, 
notice that the definition of a grammar did not tell us what the basic functions 
of the partial algebras are. So, rather than taking a simple merge as the (only) 
basic operation, we may also take merge functions as basic which require prop- 
erties of strings to be checked. This requires careful formulation. We shall have 
to require that these functions be computable in polynomial time, otherwise 
Theorem 8 below is actually incorrect. In this case, if we are operating on the
string algebra, testing whether one string is a substring of the other certainly 
takes only polynomial time. We come to our main definition: 

Definition 7 A grammar is strictly compositional (or strict) if it is analytic,
polynomially feasible, has no empty signs and is productive. A language is
strictly compositional if it has a strictly compositional grammar.

The present restrictions can be shown to be nontrivial. Before we engage in the 
proof we have to fix a last detail. We shall assume that the algebra operates 
not necessarily on strings but on sequences of strings. Moreover, these sequences 
shall have bounded length, say k. It is beyond the scope of this paper to motivate 
exactly why we depart from the ideal model that the exponents are strings. There 
are basically two arguments: (a) if we allow only strings then there does not seem 
to be a feasible algorithm to generate those languages that are not context-free, 
(b) strings are not continuous but are naturally segmented (eg by pauses). Of 
course, it is always possible to introduce a boundary marker into the string 
which would function the same way. We have felt it technically more clean to 
allow sequences of strings. The relation between the sequences of strings and the 
strings themselves is fixed by the following requirement. 

Vectorization. The string associated with an exponent $\langle x_i : i < n\rangle$ is its
concatenation $x_0 x_1 \ldots x_{n-1}$.

Hence, we shall assume that the exponents of the grammar depart only mildly 
from strings, namely, they are allowed to be strings with some gaps. We call a 
grammar vectorized if it uses sequences of strings rather than strings. 
Theorem 8 Let $G$ be a strictly compositional vectorized grammar for $L$. Then
the following holds:

1. Given a string x it can be decided in polynomial time whether it belongs to 
the language. 

2. Given a string x, there are at most exponentially many derivations for x. 

Proof. The algorithm is an adaptation of the chart method. First, notice that
any derivation has length at most $|x|$, where $|x|$ denotes the length of $x$. A
representing term for $x$ has therefore at most $|x|$ mode symbols. It is now easy
to see that $x$ is at most exponentially ambiguous. Now for the first claim. Let
$n := |x|$. By analyticity, the sequence $\langle x_i : i < k\rangle$ must consist of disjoint
substring occurrences of $x$. (This is easily established by induction.) There exist
at most $n^2$ substrings, and hence at most $n^{2k}$ $k$-sequences of substrings. In
the first step, try to match a sequence against the exponent of a 0-ary mode.
This takes polynomial time. This gives the list $L_0$. Now, given $L_i$, let $L_{i+1}$ be
the result of applying all possible modes to the members of $L_i$ and matching
the result against $x$. Then $x$ corresponds to some exponent of a sign in $L$ iff there is
a $\sigma \in L_i$ for some $i < n$ such that the exponent (or its product) is $x$. Now,
computing $L_{i+1}$ from $L_i$ takes time polynomial in $n$. For $L_i$ has at most polynomially
many members, and there are finitely many modes. Each application of a single mode
takes polynomial time (by polynomial feasibility). We have to compute $L_i$ only
for $i < n$. This concludes the proof of the first claim. Q. E. D.
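
The closure construction in the proof can be sketched for the simplest case, where exponents are plain strings (sequences of length 1). The fragment below is a hedged illustration rather than the algorithm of the proof itself: it uses the productive variant of the digit grammar (each digit is a nullary sign of type D and of type S, and the only other mode is the binary one), keeps a chart of signs whose exponent is a substring of the input, and iterates at most n times. The names `NULLARY`, `circ` and `parse` are hypothetical.

```python
from itertools import product
from typing import List, Optional, Set, Tuple

Sign = Tuple[str, str, int]  # (exponent, type, meaning)

# Nullary modes of the productive digit grammar: each digit as D and as S.
NULLARY: List[Sign] = [(str(d), t, d) for d in range(10) for t in ("D", "S")]


def circ(s1: Sign, s2: Sign) -> Optional[Sign]:
    """Binary mode: S o D = S, exponents concatenated, meaning 10m + n."""
    (e1, t1, m1), (e2, t2, m2) = s1, s2
    if (t1, t2) != ("S", "D"):
        return None
    return (e1 + e2, "S", 10 * m1 + m2)


def parse(x: str) -> List[int]:
    """Return the meanings of all signs of type S whose exponent is x."""
    # L_0: nullary signs whose exponent matches a substring of x.
    chart: Set[Sign] = {s for s in NULLARY if s[0] in x}
    # L_{i+1}: apply every mode to members of L_i, keep results matching x.
    for _ in range(len(x)):
        step: Set[Sign] = set()
        for s, t in product(chart, repeat=2):
            r = circ(s, t)
            if r is not None and r[0] in x:
                step.add(r)
        if step <= chart:
            break  # nothing new was derived
        chart |= step
    return sorted(m for e, t, m in chart if e == x and t == "S")
```

Filtering every derived sign against the substrings of the input is what keeps the chart polynomially bounded; a string outside the language simply yields an empty list.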

(The polynomial bounds computed here are rather bad. They suffice for the 
argument, however.) So, strictness is restrictive. The question therefore arises: 
which languages are strict and which ones are not? In particular: are natural 
languages at all strictly compositional in the sense of the definition? We believe 
that the answer to the second question is positive. Notice that the rather tight 
constraints on putting together signs do not allow to dissociate the meaning com- 
position and the string handling. For example, devices such as a Cooper-storage 
are inadmissible since they dissociate the syntactic structure from the semantic 
structure by introducing new meanings, something which is strictly prohibited. 
It may well be that Categorial Grammar takes us out of that problem by asso- 
ciating enough types with a string to cover its potentials in a context. If we are 
not so happy with having infinitely many types, however, the only way to get
around this is to postulate stronger string handling mechanisms. One promising
proposal that has been made recently is that of the so-called Literal Movement
Grammars (LMGs), introduced by Annius Groenink in [?]. An LMG has
rules of the following form

$A(\gamma) \to B_0(\delta_0)\ B_1(\delta_1)\ \ldots\ B_{n-1}(\delta_{n-1})$

where $A$ and the $B_i$ are nonterminals, here called types, and $\gamma$ and $\delta_i$ sequences
of polynomials over the alphabet, possibly using variables. Since there are only
finitely many rules, the length of these sequences is bounded. We may therefore
assume that they all have the same length, adding empty strings if necessary.
The advantage of LMGs is that they add explicit rules for handling the strings.
By adding a third dimension, the meaning, we can turn an LMG into a grammar
of signs. We call an interpreted LMG a grammar that has rules of the form
$A(\gamma, g(\mu_0, \ldots, \mu_{n-1})) \to B_0(\delta_0, \mu_0)\ B_1(\delta_1, \mu_1)\ \ldots\ B_{n-1}(\delta_{n-1}, \mu_{n-1})$

where the $\mu_i$ are variables for meanings and $g$ is a polynomial function $\mathfrak{M}^n \to \mathfrak{M}$.
An easy example of an LMG is the following

$S(xx) \to S(x); \qquad S(a) \to .$

This grammar generates the language $\{a^{2^n} : n \in \omega\}$. It is shown in [?] that
any recursively enumerable language can be generated by an LMG. Therefore,
these grammars are very powerful. However, certain subclasses can be identified.
According to [?], an LMG is bottom up nonerasing if each variable on the right
hand side occurs at least once on the left hand side. An LMG is bottom up linear
if each variable on the right hand side occurs at most once on the left hand side.
An LMG is noncombinatorial if each term on the right hand side consists of a
single variable. Finally, an LMG is simple if it is bottom up linear, bottom up
nonerasing and noncombinatorial. Now the following holds

Theorem 9 (Groenink) Let $L \subseteq A^*$ be a language. $L$ is PTIME-recognizable
iff it can be generated by a simple LMG.

Now, this does not mean of course that a PTIME-recognizable sign grammar
can be generated by a simple interpreted LMG. But we shall present evidence
below that natural languages can be generated by (more or less) simple LMGs.
Notice that in the definition of bottom up nonerasingness one thing must be
added: one should talk of constants as well, not only of variables which occur on
the right hand side. However, for noncombinatorial grammars this is obviously
unnecessary.

Simple LMGs are not necessarily analytic, but if they are, they are uniformly
analytic. For example, a clause of the form

$A(x) \to B(x)\ C(x)$

can occur in a simple LMG, but if we read the rules as modes this is unacceptable.
The condition of analyticity requires instead that each variable occurs on the left
hand side at least as often as it occurs on the right hand side. Furthermore, [?]
disallows a variable to occur more than once to the left, but shows that this
condition can be circumvented. We shall therefore say that an LMG is analytic
if it is (a) noncombinatorial and (b) every variable occurs on the left hand side
of a production at least as often as it occurs on the right hand side. Languages
generated by analytic LMGs are also generated by simple LMGs but the converse
need not hold. Notice that our first example of an LMG is actually analytic.
However, it is not simple. Yet, it is easily transformed into a simple LMG (see
also [?]):

$S(xy) \to S(x)\ S(y)\ E(x, y); \qquad S(a) \to .$

$E(a, a) \to . ; \qquad E(x_1 y_1, x_2 y_2) \to E(x_1, x_2)\ E(y_1, y_2)$
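
To see the simple LMG at work, the two predicates can be read as mutually independent recursive recognizers. The sketch below is an illustration under assumptions: variables are taken to range over nonempty strings, the function names `in_S` and `in_E` are hypothetical, and memoization via `functools.lru_cache` stands in for a chart.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def in_E(x: str, y: str) -> bool:
    """E(a, a) -> .   and   E(x1 y1, x2 y2) -> E(x1, x2) E(y1, y2)."""
    if x == "a" and y == "a":
        return True
    # Try every way of cutting both arguments into two nonempty parts.
    return any(in_E(x[:i], y[:j]) and in_E(x[i:], y[j:])
               for i in range(1, len(x)) for j in range(1, len(y)))


@lru_cache(maxsize=None)
def in_S(w: str) -> bool:
    """S(a) -> .   and   S(x y) -> S(x) S(y) E(x, y)."""
    if w == "a":
        return True
    # Try every cut w = x y and check the three right hand side predicates.
    return any(in_S(w[:i]) and in_S(w[i:]) and in_E(w[:i], w[i:])
               for i in range(1, len(w)))


# Lengths below 20 for which a^n is accepted.
lengths = [n for n in range(1, 20) if in_S("a" * n)]
```

The predicate `in_E` plays the role of an equality test on strings of a's, so a string is accepted by `in_S` only if it splits into two equal accepted halves.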

We notice that in Example 4, assuming that we work with pairs of strings 
rather than strings, under a suitable reformulation we only need to assume the 
existence of a binary string function of the following form: 
$p(x, y) := \begin{cases} x & \text{if } x = y \\ \uparrow & \text{else} \end{cases}$

(Here, $\uparrow$ means that the function is undefined.)

We shall illustrate the potential of this proposal by giving an analysis of
two benchmark examples. One is the cross-serial dependencies. The other is
the class of languages with iterated cases. A particular example of the latter kind was
analyzed in [?], where it was shown that the language generated in this example
was not semilinear, hence not generable by an LCFRS (or a Multi-Component
TAG, for that matter). (For a reference on LCFRSs see [?].)

Example 5. The following LMG generates the cross-serial dependencies of 
Dutch. 

$VR(\mathit{zien}) \to .$

$VR(\mathit{laten}) \to .$

$NP(\mathit{Jan}) \to .$

$NP(\mathit{Piet}) \to .$

$V(\mathit{zwemmen}) \to .$

$VC(x_1 \cdot y_1, x_2 \cdot y_2) \to NP(x_1)\ VR(y_1)\ VC(x_2, y_2)$

$VC(x, y) \to V(y)\ NP(x)$

It is straightforward to transform this LMG into a sign grammar. We basically
need, in addition to the nullary modes, a binary mode $\bullet$ and a ternary mode $\circ$.
The binary mode is defined as follows.

$\langle (x_1, x_2), NP, f\rangle \bullet \langle (y_1, y_2), V, g\rangle := \langle (x_1 \cdot x_2, y_1 \cdot y_2), VC, f(g)\rangle$

$\circ$ operates as follows.

$\circ(\langle (x_1, x_2), NP, f\rangle, \langle (y_1, y_2), V, g\rangle, \langle (z_1, z_2), VC, \gamma\rangle) := \langle (z_1 \cdot x_1 \cdot x_2, z_2 \cdot y_1 \cdot y_2), VC, g(f)(\gamma)\rangle$

The semantics of a raising verb, here zien (English ‘to see’) is as follows:

$\lambda x.\lambda \mathcal{P}.\lambda y.\mathrm{see}'(y, \mathcal{P}(x))$

This generates the cross-serial dependencies. If we want to give an analysis of the
German verb cluster or of the analogous English construction, we only have to
modify the string polynomial $\circ^{\varepsilon}$; nothing else needs to be changed. The analysis
of cross-serial dependencies is similar to that of Calcagno [?], where a larger
fragment is analyzed using head-wrapping, which was introduced in Pollard
[?]. Calcagno also discusses the relationship with a proposal by Moortgat [2],
who uses string equations. String equations are one way to try to avoid the use
of vectorization. It is therefore worthwhile to see why it does not work. Suppose
we have the following rule

$A(u \cdot x, v \cdot y) \to B(u, v)\ C(x, y)$

Then this must be replaced in Moortgat’s system by the following rule:

$A(p) \to B(q)\ C(r) : \quad p = u \cdot x \cdot v \cdot y, \quad q = u \cdot v, \quad r = x \cdot y.$
This means that if an A is analyzed as a B plus a C then the strings associated
with A, B and C should satisfy the string equations shown to the right. The 
trouble with string equations, as Calcagno rightly points out, is that they come 
down to an a posteriori analysis of a single string into several components, which 
may or may not reflect the actual composition of the string. String equations 
are simply not analytic (in our sense of the word) . Head-grammars and LCFRSs 
on the other hand mark explicit places for inserting strings, thus reflecting the 
actual construction of the string rather than just any other. Notice that LMGs 
allow both for string equations and for vectorization, but string equations are 
not allowed in an analytic or a simple LMG. 

Example 6. We finally discuss the stacked genitives of Old Georgian (see
[?]). Old Georgian displays a phenomenon called Suffixaufnahme or Double Case.
Suffixaufnahme is found in many Australian languages, and it is said to be
iterable without limit (at least in some languages with respect to certain
constructions; see [?] for examples). Old Georgian provides a
generic case of Suffixaufnahme that is iterable (see Boeder [?]).

govel-i igi sisxl-i saxl-isa-j m-is 
all-NOM Art-NOM blood-NOM house-GEN-NOM Art-GEN 
Saul-is-isa-j 
Saul-GEN-GEN-NOM 
All the blood of the house of Saul 

We will give a compositional LMG for this construction. However, we will sim- 
plify the matter. Full NPs shall have no article, the nominative is always -j and 
the genitive does not appear in its short form -is. The grammar manipulates 
triples (x,y,z), where x is the first noun, y its stack of case suffixes and z the 
remaining NP. First we write the plain LMG. (Recall here that we distinguish 
plain concatenation ($^\frown$) from word concatenation ($\cdot$).)



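(1) $NPc(x, y, \varepsilon) \to NP(x, \varepsilon, \varepsilon)\ Case(\varepsilon, y, \varepsilon)$
(2) $Case(\varepsilon, \mathit{isa}, \varepsilon) \to .$
(3) $Case(\varepsilon, \mathit{j}, \varepsilon) \to .$
(4) $NPs(x, y, z) \to NPc(x, y, z)$
(5) $NPs(x, y^\frown \mathit{isa}, \varepsilon) \to NPs(x, y, \varepsilon)$
(6) $NPs(x, y^\frown \mathit{j}, \varepsilon) \to NPs(x, y, \varepsilon)$
(7) $NPs(x_1, y_2, x_2{}^\frown \mathit{isa}^\frown y_2 \cdot z_2) \to NPs(x_1, y_2, \varepsilon)\ NPs(x_2, \mathit{isa}^\frown y_2, z_2)$

Notice that the suffix sequence in the last rule occurs twice on the left and
twice on the right hand side. This grammar is analytic (modulo massaging away
unproductive rules). The semantic functions are as follows. For (4), (5) and (6)
the corresponding function is the identity. For (1) and (7) it is application of the
right argument to the left; (3) has the interpretation $\lambda \mathcal{P}.\mathcal{P}$ ($\mathcal{P}$ of type $\langle e, t\rangle$), but
this is done for the purpose of exposition only. (2) has the semantics

$\lambda \mathcal{P}.\lambda \mathcal{Q}.\mathrm{belong}'(y, x) \wedge \mathcal{P}(x) \wedge \mathcal{Q}(y)$

For the nouns we take $\lambda x.\mathrm{blood}'(x)$, $\lambda x.\mathrm{house}'(x)$ and $\lambda x.x = \mathrm{saul}'$ as interpretations.
They have type NP. Our target example is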
sisxlj saxlisaj Saulisaisaj

Here are the translations of the nouns with stacked genitives:

sisxlj : $\lambda x.\mathrm{blood}'(x)$

saxlisaj : $\lambda \mathcal{Q}.\mathrm{belong}'(y, x) \wedge \mathrm{house}'(x) \wedge \mathcal{Q}(y)$

Saulisaisaj : $\lambda \mathcal{Q}.\mathrm{belong}'(y, x) \wedge x = \mathrm{saul}' \wedge \mathcal{Q}(y)$

Notice that only the inner layer of case marking is semantically operative. The
outer layers do not change the meaning. Now we compose these words. By the
rules of the grammar, only the last two words can be analyzed as a constituent:

$\lambda \mathcal{Q}.\mathrm{belong}'(y, x) \wedge x = \mathrm{saul}' \wedge \mathrm{belong}'(y, z) \wedge \mathrm{house}'(x) \wedge \mathcal{Q}(z)$

This can now be composed with the first word; this gives the following translation
for our example.

$\mathrm{belong}'(y, x) \wedge x = \mathrm{saul}' \wedge \mathrm{belong}'(z, y) \wedge \mathrm{house}'(y) \wedge \mathrm{blood}'(z)$

For this to work properly, substitution must be defined correctly. That we have 
free variables here is just an artifact of the simplicity of the fragment and could 
of course be avoided. 

The last example also demonstrates how languages with stacked case mark- 
ing can be treated. However, we shall note here that LMGs cannot handle such 
languages if they have completely free word order. It has been confirmed by 
Alan Dench (p. c.) that those languages which have the most extensive iterated 
case marking system do not allow for free word order beyond the clause bound- 
ary. Given that free word order within a clause can in principle be accounted 
for compositionally using LMGs — as we believe — this gives evidence that 
LMGs have enough string handling capacity. To show this is however beyond 
the scope of this paper. We shall only note that languages with stacked cases 
cannot simply be treated using a function that compares strings of cases, since 
the exponents of cases may actually be different. This means that the string 
handling component must in these examples rely on rather delicate functions, 
which are however computable in linear time. [?] has argued that German allows
scrambling across any number of clause boundaries. If that is so, German could 
also not be handled by an interpreted LMG compositionally (in the strict sense) . 
There is however every reason to believe that the arguments cannot follow in any 
order whatsoever. Rather, free word order is only present when the arguments 
are sufficiently distinguishable either morphologically (by their case endings) or 
semantically (animate vs inanimate). Otherwise, we claim, word order is fixed. 
If we are right, also German is not so exceptional after all. 
Abstract. In this paper, we try to build a bridge between Categorial
Grammar and Minimalist Grammars as they result from works which
follow the Chomskyan enterprise (cf. Stabler [?], Cornell [?]). We show
that weak and strong features can be replaced by special modalities which 
are used to control resource management. 



1 Introduction 

There is a growing interest in comparing the Minimalist Program [2] and
Categorial Grammar [?, ?]. This interest has become particularly strong since Stabler’s
formalization of minimalist ideas under the label of “Minimalist Grammars”
[?], and since other works like those of T. Cornell [3], D. Heylen [?] and W.
Vermaat (to appear). In this paper we try to develop a viewpoint on these
minimalist grammars based on a categorial type logic as proposed by M. Moortgat,
N. Kurtonina, and D. Oehrle [?]. It can be distinguished from other
attempts in several respects. In contrast with a conception we have proposed
elsewhere [?], which uses proof-nets as a representational device for expressing
derivations, we are here using a purely “derivationalist” approach. In contrast
with Cornell’s approach [?], which is also strictly derivationalist, we don’t start
with an enumeration (in the Chomskyan sense) of resources trying to build by
their means a correct sentence, but, as in orthodox categorial grammars, we
start from a sequent as a goal to demonstrate, the antecedent of which consists
of a parenthesized phonological form.

Therefore, in some sense, we are developing a reverse conception with regard
to the generative aspect of minimalist grammars. But of course, that belongs
to the nature of categorial grammars themselves, as opposed to generative ones.
Another aspect of our proposal concerns the treatment of the numerous functional
categories Chomsky introduces in his representation of a sentence, which
include AGRSP, AGROP, NEGP, TP, CP etc. As has been pointed out by
some researchers [?], these nodes do not always have a status equivalent to usual
categories like NP (or DP), S, PP or AP. In fact, functional nodes often appear
to provide mere targets for move^1. This is the reason why we assume that such

^1 Bouchard [?] speaks of the proliferation of categories that function strictly as escape
hatches, or landing sites, for moved elements (such as AGR).

M. Moortgat (Ed.): LACL’98, LNAI 2014, pp. 143-^^^ 2001. 
nodes are replaced by composition modes, in the sense of [?] and [?]. Moreover,
if moves are always explained by the need for checking features, we shall assume 
that such features are expressed by modalities (cf 0) and are checked under 
their appropriate composition mode. 

This brings us to another difference with regard to [?], where features are dealt
with exactly like categories. We can summarize by saying that every deduction
in the calculus is oriented towards a goal which is a sequent of the form $\Gamma \vdash F$
where $\Gamma$ is a structured multiset of lexical resources and where $F$ is a formula^2.
At the beginning, lexical resources are mere words that are replaced by more or
less complex formulae in the course of the deduction, by means of a [lex]-rule.
The structure itself is not known in advance and must be guessed. Because we
want to stay close to the categorial tradition, which always starts from a set
of types associated with words in order to reduce the sequence they form to a
base-category, we are led to describe the reverse transformations w.r.t. the move
operations of the minimalist grammars.

More explicitly, we try to study how constituents are moved from their overt
position to the position from where they originate. These positions are simply 
those where they can be cancelled by the usual [/L] and [\L]-rules. 

Of course, when reading the deductions in the top-down direction, we find back 
the usual directions for moves, that is a leftward orientation, towards higher 
positions in the P-marker. 

The Multimodal calculus we present here is not a translation of Stabler’s Mini- 
malist Grammars in multimodal terms, even if it is supposed to have a similar 
generative power. Our main objective is to remain in the spirit of the Minimal- 
ist Program by avoiding as many spurious devices (like empty elements, empty 
nodes or empty types) as possible. 

2 Resource Control 

2.1 Modes and Modalities 

Let us simply recall that on a set of grammatical resources, we can define sev- 
eral binary products and their residuals, and several unary operators and their 
residuals. For a non commutative binary product, the residuation law simply 
expresses that: 

$A \bullet B \to C \iff A \to C/B \iff B \to A\backslash C$

For each unary operator $\Diamond$, we can also canonically define a dual one $\Box$ such
that:

$\Diamond A \to B \iff A \to \Box B$

We shall use a calculus with three products: o, • and *. • is neither commutative 
nor associative, its adjuncts will be the usual / and \. o is not strictly speaking 
commutative nor associative either, but interaction postulates between the two 
products will give us access to commutativity and associativity. Its adjuncts will 
be denoted /° and \°, they will occur in the lifted types associated with some 

^2 or possibly a set of formulae, as we shall see later on.
lexical entries. The product * will also have residuals, denoted by /* and \*,
mainly used for semantic purposes.

The key point will be that a deduction has to switch from an occurrence of • 
to an occurrence of o each time some move (ie some tree restructuring) has to 
be performed, and this will be done each time a strong modality gives access 
to o. So, only strong modalities will be responsible for overt moves. Moreover, 
the occurrence of o responsible for a move is lowered at each application of the 
corresponding structural postulate and replaced by an occurrence of • at the 
upper position. We shall see later on the use of *. 

In the Došen-style axiomatic presentation of resource-conscious logics, we shall
assume the following postulates: 

For strong modalities:

$\Diamond_f(A \bullet B) \to (\Diamond_f A) \circ B$ [K1S]
$\Diamond_f(A \circ B) \to (\Diamond_f A) \circ B$ [K1]
$\Diamond_f(A \circ B) \to A \circ \Diamond_f B$ [K2]

For weak modalities:

$\Diamond_c(A \bullet B) \to (\Diamond_c A) \bullet B$ [K1W]
$\Diamond_c(A \circ B) \to (\Diamond_c A) \circ B$ [K1W]
$\Diamond_c(A \bullet B) \to A \bullet \Diamond_c B$ [K2W]
$\Diamond_c(A \circ B) \to A \circ \Diamond_c B$ [K2W]

Communication between products:

$A \circ B \to A \bullet B$ [Incl]
$A \circ (B \bullet C) \to (A \circ B) \bullet C$ [MA]
$A \circ (B \bullet C) \to B \bullet (A \circ C)$ [MC]
$A \circ B \to B \bullet A$ [Comm], if $A$ and $B$ are product-free

Comments:

— A strong composition mode $\Diamond_f$ gives license to a constituent to move. That
means that when such a composition mode meets the $\bullet$-product, it is changed
into the $\circ$-product. Because a strong mode attracts the corresponding feature
to a “specifier” position in the structure, we assume that this feature is in the
highest left position in the tree under the root affected by this composition
mode. That amounts to distributing the mode only over the first conjunct.

— Of course, a strong mode can affect a $\circ$-product: that will happen every time
another strong feature has already occurred and the constituent affected by
the dual modal has not (yet) moved. In this case, the second strong mode
is distributed either over the first conjunct (in case the constituent which
has not moved is also provided with the modality corresponding to this new
mode) or over the second one (in order to be transmitted later on to its first
conjunct).
— A weak mode does not give any license to move and therefore if it meets a
$\{\bullet, \circ\}$-product, the sort of product is kept. The second case occurs when a
$\Diamond_c$ has to “jump” over a strong feature which has not yet moved.

— The [Incl]-rule makes it possible to go back to $\bullet$ at any moment.

— The [MA]-rule ensures that associativity can be performed only by exchanging
the two products: the main one must switch from $\circ$ to $\bullet$ and the subordinate
one from $\bullet$ to $\circ$. In transformational terms, that will correspond to
the lowering of $A$ and its adjunction to $B$ on its left.

— The [MC]-rule expresses a kind of mixed commutativity and corresponds, in
transformational terms, to the lowering of $A$ and its permutation with $B$.

— The [Comm]-rule is commutativity under the condition of changing $\circ$ into $\bullet$
and restricted to product-free formulae. This rule is intended to be used at
the end of a sequence of permutation steps of the [MC]-kind.
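
The communication postulates can be read as rewrite steps on structures, and a small sketch makes the "lowering" visible. The encoding is an assumption made for illustration only: a structure is either an atom (a string, taken to be product-free) or a triple whose first component names the product (`"circ"` for the product that licenses movement, `"dot"` for the other); each function applies one postulate at the root and returns `None` when it does not apply.

```python
from typing import Optional, Tuple, Union

# A structure is an atom (str) or a triple (op, left, right),
# with op either "circ" or "dot".
Term = Union[str, Tuple[str, "Term", "Term"]]


def incl(t: Term) -> Optional[Term]:
    """[Incl]: A circ B -> A dot B."""
    if isinstance(t, tuple) and t[0] == "circ":
        return ("dot", t[1], t[2])
    return None


def ma(t: Term) -> Optional[Term]:
    """[MA]: A circ (B dot C) -> (A circ B) dot C."""
    if (isinstance(t, tuple) and t[0] == "circ"
            and isinstance(t[2], tuple) and t[2][0] == "dot"):
        a, (_, b, c) = t[1], t[2]
        return ("dot", ("circ", a, b), c)
    return None


def mc(t: Term) -> Optional[Term]:
    """[MC]: A circ (B dot C) -> B dot (A circ C)."""
    if (isinstance(t, tuple) and t[0] == "circ"
            and isinstance(t[2], tuple) and t[2][0] == "dot"):
        a, (_, b, c) = t[1], t[2]
        return ("dot", b, ("circ", a, c))
    return None


def comm(t: Term) -> Optional[Term]:
    """[Comm]: A circ B -> B dot A, only for product-free (atomic) A and B."""
    if (isinstance(t, tuple) and t[0] == "circ"
            and isinstance(t[1], str) and isinstance(t[2], str)):
        return ("dot", t[2], t[1])
    return None


# Lower A past B and C by two [MC] steps, then finish with [Comm].
start: Term = ("circ", "A", ("dot", "B", ("dot", "C", "D")))
step1 = mc(start)
step2 = mc(step1[2])
last = comm(step2[2])
```

In this run the atom A travels from the top left of the structure to the bottom right, one [MC] step at a time, with [Comm] closing the sequence.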

At the beginning of the analysis, we have a tree-structured multiset of lexical
resources: we assume the comma “,” to be the structural counterpart of the $\bullet$-product.
We also assume the following inclusion between strong and weak modalities:

$\Diamond_f A \to \Diamond_c A$

This allows the following cancellation to be performed:

$\Diamond_f \Box_c A \to A$

This means that a strong modality $\Diamond_f$ can cancel a weak modality $\Box_c$, but not
the other way round.

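Apart from this package of postulates, we recall here the usual rules
for introducing each binary connective depending on a mode $i$, and each
unary connective relative to a modality (feature) $a$, in the sequent calculus
format.

Rules:

$\dfrac{\Delta \vdash B \qquad \Gamma[A] \vdash C}{\Gamma[(A/_i B, \Delta)^i] \vdash C}\ [/L] \qquad \dfrac{\Delta \vdash B \qquad \Gamma[A] \vdash C}{\Gamma[(\Delta, B\backslash_i A)^i] \vdash C}\ [\backslash L]$

$\dfrac{(\Gamma, B)^i \vdash A}{\Gamma \vdash A/_i B}\ [/R] \qquad \dfrac{(B, \Gamma)^i \vdash A}{\Gamma \vdash B\backslash_i A}\ [\backslash R]$

$\dfrac{\Gamma[(A, B)^i] \vdash C}{\Gamma[A \bullet_i B] \vdash C}\ [L\bullet_i] \qquad \dfrac{\Gamma \vdash A \qquad \Delta \vdash B}{(\Gamma, \Delta)^i \vdash A \bullet_i B}\ [R\bullet_i]$

$\dfrac{\Delta \vdash A \qquad \Gamma[A] \vdash B}{\Gamma[\Delta] \vdash B}\ [cut] \qquad A \vdash A\ [axiom]$

$\dfrac{\Gamma[A] \vdash B}{\Gamma[(\Box_a A)^a] \vdash B}\ [L\Box_a] \qquad \dfrac{(\Gamma)^a \vdash B}{\Gamma \vdash \Box_a B}\ [R\Box_a]$

$\dfrac{\Gamma[(A)^a] \vdash B}{\Gamma[\Diamond_a A] \vdash B}\ [L\Diamond_a] \qquad \dfrac{\Gamma \vdash A}{(\Gamma)^a \vdash \Diamond_a A}\ [R\Diamond_a]$

Every postulate $F \to G$ may be set in the sequent format and then becomes a
rule:

$\dfrac{\Gamma \vdash C}{\Delta \vdash C}$

where $\Delta$ is the structural counterpart of $F$ and $\Gamma$ the structural counterpart of
$G$.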



2.2 Lexicon and Goals 

Lexical entries. They are made of categorial types using / and \ (and later on
/*, \*, /° and \°), and modalities representing their features. Examples:

aime ::= ^agrV^infi{{np\s) /np) Paul ::= □agriv(m,s, 3 )°fcnp 



The interpretation of $\Box_k np$ is: an np waiting for a case. 



Goals. The sequents we want to demonstrate have antecedents made of paren- 
thesized strings of words and a consequent consisting of a unique modalized 
formula, like for instance: ^nom^infG’ expressing the fact that we want to 
reduce the structured resources in the antecedent to a category s which needs 
an inflection and the assignment of a nominative case. 

Of course, there can be situations with no accusative case, and others with more 
cases. In fact, for any arbitrary sentence, there are a priori several modalized 
types to which it can be reduced, but among them only one which is the true 
type, this depending on the type of the main verb, the presence or absence of 
negation and so on. But these possibilities are very limited, they form a set, like 
for instance (for an affirmative sentence): 






nom^acA 



„ 



n® n n® « n® n® qI 



From the parsing viewpoint, we can consider that a first phase consists in select- 
ing the correct final type in the consequent simply by scanning the modalities 
which occur in the lexical types, in the antecedent. 
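The scanning phase just described can be sketched as follows. This is only an illustration under strong assumptions: the toy lexicon, the candidate goal list and the rule that every concrete case counts as one unresolved case feature are all hypothetical, introduced here for the example and not taken from the paper.

```python
from collections import Counter

# Hypothetical toy lexicon: the features (modalities) carried by each lexical type.
# "case" stands for an unresolved case feature (the variable k in box_k np).
LEXICON = {
    "Peter": ["case"],
    "loves": ["infl"],
    "Mary": ["case"],
    "sleeps": ["infl"],
}

# Hypothetical candidate goals: the sequence of modalities placed in front of s.
CANDIDATE_GOALS = [
    ("nom", "infl"),
    ("nom", "acc", "infl"),
    ("nom", "acc", "dat", "infl"),
]

CASES = {"nom", "acc", "dat"}


def signature(features):
    """Count features, collapsing every concrete case into the placeholder 'case'."""
    return Counter("case" if f in CASES else f for f in features)


def select_goals(words):
    """Return the candidate goals whose modalities match those found in the antecedent."""
    found = Counter()
    for word in words:
        found += signature(LEXICON[word])
    return [goal for goal in CANDIDATE_GOALS if signature(goal) == found]


transitive = select_goals(["Peter", "loves", "Mary"])
intransitive = select_goals(["Peter", "sleeps"])
```

With two case-seeking nps and one inflected verb in the antecedent, only the goal with two case modalities and one inflection modality survives the filter.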

Moreover, given a set of types S, we shall write $\Gamma \vdash S$ if and only if there exists 
some t such that $t \in S$ and $\Gamma \vdash t$. 

Because the [□R]-rule is such that, in order to prove 

$(X_1, (X_2, \ldots, X_n)) \vdash \Box s$ 

we have to prove 

$((X_1, (X_2, \ldots, X_n)))^{\Diamond} \vdash s$ 
this amounts to assume that the modalities '^nom 

fected to the product on the left thus replacing the former categories IP and 
AGRP. Thus for instance, the grammaticality of the sentence Peter loves Mary 
is established by proving the sequent: 

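$(\mathit{Peter}, (\mathit{loves}, \mathit{Mary})) \vdash \Box^{s}_{nom}\Box^{w}_{acc}\Box^{w}_{infl}\,s$ 

This amounts to successively proving the sequents: 

\[
\begin{array}{l}
((\mathit{Peter}, (\mathit{loves}, \mathit{Mary})))^{\Diamond^{s}_{nom}} \vdash \Box^{w}_{acc}\Box^{w}_{infl}\,s \\
((\Box_k np, (\mathit{loves}, \mathit{Mary})))^{\Diamond^{s}_{nom}} \vdash \Box^{w}_{acc}\Box^{w}_{infl}\,s \\
((\Box_k np)^{\Diamond^{s}_{nom}}, (\mathit{loves}, \mathit{Mary}))^{\circ} \vdash \Box^{w}_{acc}\Box^{w}_{infl}\,s \\
(np, (\mathit{loves}, \mathit{Mary}))^{\circ} \vdash \Box^{w}_{acc}\Box^{w}_{infl}\,s \\
(np, (\mathit{loves}, \mathit{Mary})) \vdash \Box^{w}_{acc}\Box^{w}_{infl}\,s \\
((np, (\mathit{loves}, \mathit{Mary})))^{\Diamond^{w}_{acc}} \vdash \Box^{w}_{infl}\,s \\
(np, ((\mathit{loves}, \mathit{Mary}))^{\Diamond^{w}_{acc}}) \vdash \Box^{w}_{infl}\,s \\
(np, (\mathit{loves}, (\mathit{Mary})^{\Diamond^{w}_{acc}})) \vdash \Box^{w}_{infl}\,s \\
(np, (\mathit{loves}, (\Box_k np)^{\Diamond^{w}_{acc}})) \vdash \Box^{w}_{infl}\,s
\end{array}
\]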



until we reach the sequent: 



$(np, ((np\backslash s)/np, np)) \vdash s$ 
which obviously succeeds by [/L] and [\L]. 
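The final check can be illustrated by a minimal sketch that reduces a bracketed structure using only forward and backward application, which is what [/L] and [\L] amount to on product-free sequents of this shape. The tuple encoding of types and the helper names are assumptions made for the example; hypothetical reasoning and modalities are deliberately left out.

```python
def over(a, b):
    """The type a/b: looks for a b on its right and yields an a."""
    return ("/", a, b)


def under(b, a):
    """The type b\\a: looks for a b on its left and yields an a."""
    return ("\\", b, a)


def reduce_structure(struct):
    """Reduce a binary structure of types to a single type, or return None.

    Atomic types are strings, complex types are 3-tuples built by over/under,
    and structures are 2-tuples (left, right).
    """
    if isinstance(struct, str) or len(struct) == 3:
        return struct  # already a type
    left = reduce_structure(struct[0])
    right = reduce_structure(struct[1])
    if left is None or right is None:
        return None
    # Forward application: (a/b, b) reduces to a
    if isinstance(left, tuple) and left[0] == "/" and left[2] == right:
        return left[1]
    # Backward application: (b, b\a) reduces to a
    if isinstance(right, tuple) and right[0] == "\\" and right[1] == left:
        return right[2]
    return None


transitive_verb = over(under("np", "s"), "np")  # (np\s)/np
good = reduce_structure(("np", (transitive_verb, "np")))
bad = reduce_structure((("np", transitive_verb), "np"))
```

The second structure fails because the bracketing puts the subject next to the verb before the object has been consumed, which is why the restructuring postulates are needed in the general case.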

The first step consists in using the [□R]-rule: the first mode which is used is 
then the $\Diamond^{s}_{nom}$-mode. The [lex]-rule is used at the second step in order to replace 
the word Peter by its lexical type, which asks for a case. At the third step, $\Diamond^{s}_{nom}$ 
"opens" the •-product by transforming it into the ∘ one, and it is distributed by 
[K1S] over the first conjunct. At the fourth step, the case feature is cancelled 
by the [□L]-rule. At the fifth step, the [Incl]-rule allows us to go back to the 
•-product. Notice that in fact the subject np has not moved: this is the option 
that will turn out to be correct, according to the type assigned to the verb (we 
are not making useless moves). At the sixth step, a new cycle begins 
with a new application of the [□R]-rule: the second mode is the $\Diamond^{w}_{acc}$ one. By 
means of several applications of [K1W] and [K2W], the $\Diamond^{w}_{acc}$-mode reaches the 
leaf labelled with the lexical type associated with Mary. At this point one mode 
still remains to be processed: the $\Diamond^{w}_{infl}$-mode. It will be used in order 
to "free" the verbal type. Finally, we get a sequent easily provable in a basic 
categorial grammar. 

This history of moves can be represented as a series of tree-restructurings of the 
P-marker associated with the sentence. The following figure shows some of them. 
The last tree is obtained after several steps from the previous one. 






In the next sections, we shall sometimes present these restructurings instead 
of the entire derivations, using the list representations of trees for the sake of 
brevity. 



Minimality conditions. As often stated in the minimalist framework, minimality 
conditions are required to explain why phrases move towards their 
nearest target. Here, we shall assume that deductions are as short as possible, 
measuring their length by the number of applications of a structural rule ([Comm], 
[MC] or [MA]) they use. 
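This length measure is easy to state operationally. The sketch below assumes, purely for illustration, that a derivation is available as a list of rule names and that at least one candidate derivation exists; neither representation comes from the paper.

```python
STRUCTURAL_RULES = {"Comm", "MC", "MA"}


def structural_length(derivation):
    """Number of structural-rule applications in a derivation (a list of rule names)."""
    return sum(1 for rule in derivation if rule in STRUCTURAL_RULES)


def minimal_derivations(derivations):
    """Keep only the derivations of least structural length (input must be non-empty)."""
    best = min(structural_length(d) for d in derivations)
    return [d for d in derivations if structural_length(d) == best]


# Two hypothetical derivations of the same sequent.
short = ["boxR", "lex", "K1S", "boxL", "Incl", "/L", "\\L"]
long_ = ["boxR", "lex", "K1S", "boxL", "Comm", "MA", "Comm", "Incl", "/L", "\\L"]
chosen = minimal_derivations([long_, short])
```

Logical and modal rules do not count towards the length, so only the three restructuring steps of the second derivation are penalised.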



3 Examples 

3.1 Interrogatives 

Let us take the example of an interrogative in French: Quel ecrivain Pierre aime, 
with the assignments: 

quel — ecrivain ::= Pierre ::= aime ::= ^infi{{np\s) /np) 

(We omit here agreement features for simplicity) 

The goal sequent is: 

{{quel, ecrivain), {Pierre, aime)) h O^h^nom^acc^fnfiS 



and a fragment of the proof is: 
{{Pierre, {aime, Dknp))) 



os 



h Da 



[KIS] 

[□i?] 



[Com7 

[MC] 



{Pierre, {aime, Okup)) h 
{Pierre, {aknp,aime)°) h 
(□fcnp, {Pierre, aime))° h 
{{0^ohOknp)'^th, {Pierre, aime))° h 

((□„^Dfcnp, {Pierre, aime)))'^%h ^ 

{{{quel, ecrivain), {Pierre, aime)))^%u ^ O^om^accOf^fiS 
{{quel, ecrivain), {Pierre, aime)) h □f,^n®o^DaccOf„/iS 



[□L] 
[AIS] 
[lex] 
[OR] 



Let us remark that the $\Diamond_{wh}$ mode plays the role of the former functional category CP, 
and that the first conjunct of the product now corresponds to the specifier of 
the CP. 



3.2 Clitics 

In some languages (like the Romance ones) some lexical items, like pronouns, have 
strong features whereas other items that could replace them (say the full nps in 
this case) have weak features. This shows up in morphology: the full nps are not 
case-marked whereas the pronouns are (cf. in French: il, le, lui ...). These items 
will occur in the lexicon as affected by $\Box^{s}$. Of course, the $\Box^{s}$ can be cancelled 
only by a corresponding $\Diamond^{s}$. So, in such a case, the sequent is proved only if the 
base category s contains in its list of modalities the corresponding $\Box^{s}$, thus 
enforcing displacements. Informally stated, the sentence (Pierre, (le, (lui, donne))) 
will have to reduce to $\Box^{s}_{nom}\Box^{s}_{acc}\Box^{s}_{dat}\Box^{s}_{infl}\,s$, in the following transformational 
steps: 



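\[
\begin{array}{ll}
(\mathit{Pierre}, (\mathit{le}, (\mathit{lui}, \mathit{donne}))) & \to_{[K1S]+[\Box L]}\ (\mathit{Pierre}, (\mathit{le}, (\mathit{lui}, \mathit{donne})))^{\circ} \\
 & \to_{[K2]+[K1S]+[\Box L]}\ (\mathit{Pierre}, (\mathit{le}, (\mathit{lui}, \mathit{donne}))^{\circ})^{\circ} \\
 & \to_{[K2]+[K1S]+[\Box L]}\ (\mathit{Pierre}, (\mathit{le}, (\mathit{lui}, \mathit{donne})^{\circ})^{\circ})^{\circ} \\
 & \to_{[Comm]}\ (\mathit{Pierre}, (\mathit{le}, (\mathit{donne}, \mathit{lui}))^{\circ})^{\circ} \\
 & \to_{[MA]}\ (\mathit{Pierre}, ((\mathit{le}, \mathit{donne})^{\circ}, \mathit{lui}))^{\circ} \\
 & \to_{[Comm]}\ (\mathit{Pierre}, ((\mathit{donne}, \mathit{le}), \mathit{lui}))^{\circ} \\
 & \to_{[Incl]}\ (\mathit{Pierre}, ((\mathit{donne}, \mathit{le}), \mathit{lui}))
\end{array}
\]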

The last structure reduces successfully if donne ::= $((np_{nom}\backslash s)/np_{dat})/np_{acc}$. Let 
us add that, in order to get the arguments in the right places in structures, we 
may use for nps disjunctive types like: 

$\Box_{nom} np_{nom} \vee \Box_{acc} np_{acc} \vee \Box_{dat} np_{dat}$ 

or ’existential’ types like 

where k is a variable on {nom, acc, dat}, with the arguments of verbs appropri- 
ately labelled. 
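The purely structural part of the clitic steps above (the [Comm] and [MA] moves) can be replayed on nested tuples. This is a simplified sketch: it ignores the ∘/• modes and the side conditions of the postulates, and the helper names `comm`, `ma` and `at` are introduced here only for illustration.

```python
def comm(tree):
    """Commutativity on a pair: (A, B) becomes (B, A)."""
    a, b = tree
    return (b, a)


def ma(tree):
    """Mixed associativity, re-bracketing to the left: (A, (B, C)) becomes ((A, B), C)."""
    a, (b, c) = tree
    return ((a, b), c)


def at(path, rule, tree):
    """Apply rule to the subtree reached by path (0 = left daughter, 1 = right daughter)."""
    if not path:
        return rule(tree)
    children = list(tree)
    children[path[0]] = at(path[1:], rule, tree[path[0]])
    return tuple(children)


start = ("Pierre", ("le", ("lui", "donne")))
step1 = at((1, 1), comm, start)   # swap lui and donne
step2 = at((1,), ma, step1)       # adjoin le to donne
step3 = at((1, 0), comm, step2)   # put donne before le
```

The three steps reproduce the sequence [Comm], [MA], [Comm] of the derivation, ending in the structure to which the ditransitive type of donne applies directly.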
3.3 Adverbials 

A topic much addressed in the minimalist literature concerns the difference in 
placement of adverbials between French and English. We argue that this 
is explained by the strong modality $\Box^{s}_{infl}$ in French, as opposed to a corresponding 
weak modality in English. Because of that, the structure (Pierre, (aime, 
(tendrement, Marie))) is transformed, according to the sequence of modalities to 
check, into the final structure (Pierre, ((tendrement, aime), Marie)) in order 
to succeed by [/L] and [\L]. In English, the structure must initially be (Peter, 
((tenderly, loves), Mary)) in order to be reduced to s. 

Let us assume the following assignments: 

tendrement ::= $V/V$     aime ::= $\Box^{s}_{infl}((np_{nom}\backslash s)/np_{acc})$ 

where V is a meta-variable for any type of verb. 

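The proof for the French example is the following: 

\[
\begin{array}{ll}
(np_{nom}, ((V/V, (np\backslash s)/np), np_{acc})) \vdash s & \\
(np_{nom}, ((V/V, (np\backslash s)/np), np_{acc}))^{\circ} \vdash s & [Incl] \\
(np_{nom}, (((np\backslash s)/np, V/V)^{\circ}, np_{acc}))^{\circ} \vdash s & [Comm] \\
(np_{nom}, ((np\backslash s)/np, (V/V, np_{acc}))^{\circ})^{\circ} \vdash s & [MA] \\
(np_{nom}, ((\mathit{aime})^{\Diamond^{s}_{infl}}, (\mathit{tendrement}, np_{acc}))^{\circ})^{\circ} \vdash s & [lex + \Box L] \\
(np_{nom}, ((\mathit{aime}, (\mathit{tendrement}, np_{acc})))^{\Diamond^{s}_{infl}})^{\circ} \vdash s & [K1S] \\
((np_{nom}, (\mathit{aime}, (\mathit{tendrement}, np_{acc})))^{\circ})^{\Diamond^{s}_{infl}} \vdash s & [K2] \\
(np_{nom}, (\mathit{aime}, (\mathit{tendrement}, np_{acc})))^{\circ} \vdash \Box^{s}_{infl}\,s & [\Box R]
\end{array}
\]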



{{npnom, {aime, (tendrement, marie)))°) 



o\0 



h □ 



infl"- 



{npnomj{ci'^Tne^ {tendrement, marie))y h 
{{^knppnom,{ai'me, {tendrement, marie)))° b 
{{pierrep^g^,{aime, {tendrement, mar ie)))° b 



[□i?] 



pL] 



{{pierre, {aime, {tendrement, marie)))) 



OS 

nom 



b ^acc^inflS 



{pierre, {aime, {tendrement, marie))) b 



[KIS] 

[□i?] 



We can notice here that the [MA]-rule is used in order to adjoin the adverbial to 
its verb, and [Comm] in order to put it in the correct function-argument order. 
The French adverb tendrement cannot occur before the verb simply because, when 
the strong mode is transferred to the left after the use of the [K2]-rule, this 
mode could only go to the first conjunct, which would in this case be the adverb, 
and the feature infl, borne by the verb, could not be cancelled. Of course, there 
could be an adverbial of type s/s. In this case, only the [MC]-rule would be needed. 
4 Curry-Howard Semantics 

4.1 Overt Movement 

One of the biggest advantages of the type-logical approach is its ability to 
describe, by a very simple and elegant tool, how to build the logical form of a 
sentence. This tool is the Curry-Howard isomorphism. It can be applied here 
modulo some minor changes in our view of deductions. Our first derivation, 
associated with the interrogative French sentence Quel ecrivain Pierre aime, 
is perhaps not the best we can do if we want to take semantics into account. 
It would be convenient to associate the interrogative np (Quel ecrivain) with a 
λ-expression like: 



$\lambda u.\,WHICH(x, writer(x) \wedge u(x))$ 

In this case, if the expression Pierre aime can be associated with the semantics 

$\lambda y.\,likes(Pierre, y)$ 

we shall be able to obtain, by a mere application of the first function to this 
argument: 

$[\lambda u.\,WHICH(x, writer(x) \wedge u(x))]\ \lambda y.\,likes(Pierre, y)$ 

$\to WHICH(x, writer(x) \wedge likes(Pierre, x))$ 
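The reduction above can be replayed with ordinary functions. In this small sketch logical forms are encoded as nested tuples, an encoding chosen here only for illustration; Python function application plays the role of β-reduction.

```python
def which(u):
    """λu. WHICH(x, writer(x) ∧ u(x)), with formulas encoded as tuples."""
    return ("WHICH", "x", ("and", ("writer", "x"), u("x")))


def pierre_aime(y):
    """λy. likes(Pierre, y): the sentence missing its object."""
    return ("likes", "Pierre", y)


# Applying the interrogative np to the rest of the sentence.
logical_form = which(pierre_aime)
```

The result is the tuple counterpart of WHICH(x, writer(x) ∧ likes(Pierre, x)), obtained by substituting the bound variable for the missing object.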

This requires two things we have not yet used: that an np be a functor, and that 
hypothetical reasoning be used in order to give a logical form to the s missing 
an np: Pierre aime. 

The strategy for that is to use a lifted type for the interrogative np and to show 
that the rest of the sentence is of the expected type under the hypothesis that we 
have an np. We know that proving the sequent $A/^{*}(B\backslash^{*}A) * \Gamma \vdash A$ amounts 
to proving $\Gamma \vdash B\backslash^{*}A$, which amounts to proving $B * \Gamma \vdash A$. By starting from the 
first sequent, we make sure that $A/^{*}(B\backslash^{*}A)$ is applied to $\Gamma$, thus providing a 
functional interpretation of quantifiers and interrogatives, and then we prove 
the syntactic correctness of the rest of the sentence by means of the simpler type 
B. This strategy can be applied fruitfully here, giving a categorial account of 
overt movement in MGs: in the reverse option we have adopted, in an overt move, 
the semantic interpretation of the moved constituent stays in place, while the 
phonological one (represented by the lower type B) gets back to its original place. 
From the MG point of view, this means that the whole constituent (phonological 
features + semantic features) moves up, in order for the semantic interpretation 
to get to its right place. 
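The two "amounts to" steps can be spelled out as a short derivation, read from the conclusion upwards. It assumes the standard [/L] and [\R] rules for the product written * here, together with the axiom $A \vdash A$; the structural comma stands for that product.

```latex
\[
\frac{\dfrac{(B, \Gamma)^{*} \vdash A}{\Gamma \vdash B\backslash^{*}A}\,[\backslash R]
      \qquad
      A \vdash A}
     {(A/^{*}(B\backslash^{*}A),\ \Gamma)^{*} \vdash A}\,[/L]
\]
```

Read bottom-up, [/L] consumes the lifted type and leaves the goal $\Gamma \vdash B\backslash^{*}A$, and [\R] then turns the missing B into a hypothesis placed to the left of $\Gamma$.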

Let us make therefore the following assignment: 

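quel-ecrivain ::= $\Box_{wh}\Box_{nom}(s/^{\circ}(np_{nom}\backslash^{\circ}s)) \vee \Box_{wh}\Box_{acc}(s/^{\circ}(np_{acc}\backslash^{\circ}s))$ or 

$\Box_{wh}\Box_{k}(s/^{\circ}(np_{k}\backslash^{\circ}s))$ 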

and we get the following fragment of derivation: 
{sl°{npacc\°s), (npnom, aimc)) h s 
{Oaccs/°{npacc\°s),{npnom,aime)°)° \- OaccOfn/iS 
(HaccS/ i^Tipacc\ s) , ( (Cfc npfc ) ) 1“ ^acc^inflS 

{Oaccs/°{npacc\°s), {{Pierre, aime))'^^a^)° h OaacOf^fiS 
{{Oaccs/°{npacc\°s), {Pierre, aime))°) 



□L 

KlS + lex 



o\OS 



QaccCi 



nom ~ '-^acc'-^infl 



\K2] 

pR] 



{Oaccs/°{npacc\°s),{Pierre,aime)y h □„o^Daccnf„/(S 

{{0„,hOacc{s/°{npacc\°s))ytuyPi<irre,aime))° h a^^^^OaccOf^fiS 

os 



{{OwhOaccs/°{npacc\°s), {Pierre, aime)))\,h 0„omDaccD,„/iS 



{{{quel, ecrivain), {Pierre, aime))y in ^ '^Lm^accOf^fiS 
{{quel, ecrivain), {Pierre, aime)) h 



PL] 

[KIS] 

[lex] 



[□i?l 



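The success of the derivation now depends on the proof of the sequent 
$(np_{nom}, (np_{nom}\backslash s)/np_{acc}) \vdash np_{acc}\backslash^{\circ}s$, which is established by: 

\[
\begin{array}{ll}
(np_{nom}, ((np_{nom}\backslash s)/np_{acc}, np_{acc})) \vdash s & \\
(np_{nom}, (np_{acc}, (np_{nom}\backslash s)/np_{acc})^{\circ}) \vdash s & [Comm] \\
(np_{acc}, (np_{nom}, (np_{nom}\backslash s)/np_{acc}))^{\circ} \vdash s & [MC] \\
(np_{nom}, (np_{nom}\backslash s)/np_{acc}) \vdash np_{acc}\backslash^{\circ}s & [\backslash^{\circ}R]
\end{array}
\]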



Remark 1: In the first part of the deduction, one might suspect that the nominative 
case goes to the interrogative np quel ecrivain and that, correlatively, the 
accusative case goes to the subject Pierre, thus resulting in a wrong interpretation 
of the sentence. In fact, this is not possible: if it were the case, the product 
Pierre • aime would remain a •-product, and when the following modality 
(which is here the strong inflection) is transmitted to this product by 
[K1S], it would be distributed over the np instead of being distributed over 
the s, as can be the case when the product is ∘. The modality 
$\Box_{infl}$ could therefore not be deleted, resulting in a failure. The only solution is that 
$\Diamond_{nom}$ goes to the np Pierre and $\Diamond_{acc}$ to the extracted np, which is already 
combined with the rest of the sentence by a ∘ (after the cancellation of 



Remark 2: It is particularly interesting to notice that we get an analysis even for 
non-peripheral extractions (contrary to the ordinary Lambek calculus), as in 
the sentence quel livre Pierre etudie aujourd'hui? where we assume aujourd'hui 
has the type s\s. The deduction leads us from the bottom sequent: 

{{Quel, livre), {Pierre, {etudie, aujourd' hui))) h O^h^^om^accOfnfis 



to: 



{s/°{npacc\°s), {np nom 5 {{np 

nom \s)/np 

acc 5 s\s)T)°^s 



and then to: 
[np 

nom 1 {{np 

nom \s)/npa cc; ^\'^) ) fT'pacc\ S 

for which we have the following deduction: 



{{npn om : {{npn om \s)/npa 

CC; ^Pacc) ) ; ^ ^ 

{^{j^Pnom-, ij^Pacc^ ij^Pnorn\^') / ^Pac(^ ),5\s) \~ S 
{{npa CCf {npn om 1 {npn om \s)/npa 

cc) ) ; ^ ^ 

{npa CCi {{npn om 1 {npn om \s)/npa 

cc r.s\s)r^s 

{nPa cc; {nPn om ; {{nPn om \s)/np acci ) ) l~ 5 

{npn om j {{npn om \s)/npa CC1 s\s)°)° I- npacc\°S 



[Comm] 

[MC] 

[MA] 

[MA] 

lYR] 



4.2 Covert Movement 

The situations concerned by covert movement are all those in which a constituent 
whose phonological part does not move has scope over the entire sentence. 
The archetype of this situation is provided by in situ binding, a phenomenon 
which has been intensively investigated in the past in the categorial framework, 
particularly by M. Moortgat ( 0 ). We adopt here most of the solution he 
proposed. This solution consists in introducing our third product: *. 
Constituents concerned by covert movement are assigned a lifted type made with 
/*, \* and a special modality ◇ which makes communication possible with the 
*-product. We assume new communication rules, which are symmetrical with 
respect to those of ∘ with regard to •. 

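Introduction of *: 

$\Diamond A \to A * t$   [*I] 

Introduction of ◇: 

$A \circ t \to \Diamond A$   [◇I] 

Communication postulates: 

$A * B \to A \circ B$   [*∘] 

$A \bullet (B * C) \to (A \bullet B) * C$   [MA']$_1$ 

$(A * B) * C \to A * (B * C)$   [MA']$_2$ 

$A \bullet (B * C) \to B * (A \bullet C)$   [MC'] 

We give to any quantified expression the possibility of having the type 

$\Box_k \Diamond (s/^{*}(\Box np_k \backslash^{*} s))$ 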



Let us look at the following example, associated with the sentence Pierre a lu tous 
les livres, where we assume tous les livres gets this type. A fragment of the 
deduction (after the removal of $\Box_k$ and the instantiation k = acc) is: 
{npnom, {{nPnom\s)/nPacc,t)) h □npgee \*5 5 h S 

(s/*(Dnp 

acc \*s), {npn om 5 {{npn om \s)/npa 

CCi 

{npn om ; (s/*(Dnp 

acc \*s), {{npn om \s)/npa 

CCi t)nti 

{npn om 1 {(npn om \s)/npa 

CCi (s/*(Dnp 

acc vs),tr))h. 

{npnom,iinpnom\s)/npacc,<>is/*{anps,cc\*s)))) h S 
{Pierre, {aJu, (tousJesJivres))) h 



[/*L] 

[MC'] 

[MC] 

[*/] 

[OR] 



That was the first part of the deduction: the quantified expression gets to the right 
position in order to take its scope. In the second part, the right rule for \* 
is used just before the communication rule [*∘] in order to get a deduction 
similar to that of overt movement: $\Box np_{acc}$ gets back to its place, and its modalisation 
allows the cancellation of t by means of the usual law $\Diamond\Box A \to A$ and the reverse 
postulate of [*I], which is [◇I]. We get: 



{np„ om 1 {{npn om \s)/npa 

CCi npacc)) b S 

{nPnom,{{nPnom\s)/npacc,0'^npacc)) b S 

{npnom, {{nPnom\s) / npacc, (Dnpgcc, t)°)) h S ^ 

i'^Pnom: {^“^Pacci {{'^Pnom\^{ /‘^PaccA^^ ) ^ 

{Onpacc,{nPno7n,{{'n^Pno7n\s)/npacc,t)))° \- S 

[*oJ 

{Onpacc, {nPnoTn, {{'n^Pno7n\s) /npacct)))* \~ S 

{nPnom, {{nPnom\s) /npaccA)) I" ^npacc\* S 

Let us notice that the case of a subject quantified expression is solved by the 
use of the [MA']$_2$-rule. We have the following steps: 



{0{s/*{anpno7n\*s)),{V,np)) {{{s/*{anpno7n\*s)),t)*,{V,np)) -)> 

{s/*{nnpnom\*s),{t,{V,np))y -)> {anpnom,{t,{V,np))y 
{OnpnoTn,{t,{V,np))y -)> {{anpnom,ty,{V,np)) 

{<>Onpn 

om 1 {V,np)) {npn om 1 {V,np)) 

5 Other Examples 

5.1 English Relativization 

We give here some aspects of a description of relatives in English. We assume 
the following assignment: 

met ::= Ci„fi{{npnom\s) /npacc) Paul ::= □fewpi, 

that, who ::= {n\n)/Of^npk\°S the ::= {Oknpk)/n 

The nominal phrases the man that Paul met and the man who met Paul receive 
the following derivations. 
the man that Paul met. 
Bottom of the deduction: 



{n,n\n)\-n (Dinp;, nom \s)/np acc )) I- □fcnpfc\°5 

(n, ((n\n)/aknpk\°S, (Ompi, □i„/i((np nom \s)/np acc ))))hn 

□cupc h O^npc {man, {that, {Paul, met))) h n 

{the, {man, {that, {Paul, met)))) h O^npc 



[\i] 

[lex] 



Middle of the deduction: 



{p^kUPk, {npnom, ^infl{{np nom \s)/np 

acc 






{Dkupk, {{Ompi) 



os 



norm ^infl{{npn 



r\s)/npacc))°)° h 



(□fcupfc, {{Ompi,Uinfi{{np nom \s)/np 

acc))) noTTT.) ^ ^ 



nfl 



{{Okupk, {ampi,ai„fi{{np nom \s)/np acc W) 



OS 



I— n n* 

nom ' ^ in f I 



{Oknpk, {Oinpi ; ^infl {{np 

nom \s)/np 

acc )))° h 



{Oknpk, {Dmpi, □i„/i((nj5 nom \s)/np acc ))r^s 
{Dmpi, □i„/i((np„ om \s)/np 

acc )) \- □fcnpfc \°5 



pR] 

- PL] 
[Kl] 
[K2] 

pR] 

[select] 






It is worth noticing that the correct analysis is provided. One might 
expect the mode $\Diamond_{nom}$ to be assigned to $np_k$. But this would result in a failure 
because, in this case, the mode $\Diamond_{acc}$ would necessarily be compelled to go down 
to the root of the subtree $(\Box_l np_l, V)$, which is a •-node. The accusative modality 
would be checked, but without opening this •-product, and no structural postulate 
could be applied in order to get the correct configuration for cancelling the 
slashes. That $\Diamond_{nom}$ be assigned to $np_l$ is therefore the only solution. 
the man who met Paul: 

Middle of the deduction: 



• — O 



{np nom t {{np nom \s)/np acc ; Dinpi))° h □ accS 
{{aknpk)‘^Lm,{{np nom \s)/np acci Dinpi))° h □ accS 
((Qfcnpfc, {{np nom \s)/np 

acc 1 I^/'^PO)) nom ^accS 

(□fcnpfc, {{np nom \s)/np acc 1 Ompi)) \- S 
{{np nom \s)/np 

acct Dinpi) h □fcnpfc\°5 



PL] 

[KIS] 

[□ 7 ?] 



The nominative modality goes directly to $\Box_k np_k$, thus giving the nominative 
case to the missing element of the relative. There is then no difficulty in 
attributing the accusative case to an element which is already in its proper 
place, and the cancellation of slashes can be performed in the straightforward 
manner. 



5.2 VSO Languages and Subject-Verb Inversion 

It is easy to show that this framework takes SOV languages into account: this 
just amounts to having a strong accusative modality. But VSO languages require 
a more fine-grained analysis of features, splitting the infl mode into two 
separate modes: one for tense and the other for agreement. The role which was 
played in our previous analysis by infl will now be played by agr, and the 
feature tense will be put ahead in the series of modalities which affects a goal, 
in such a way that a tensed sentence will reduce to In a 

VSO language, $\Box_{tps}$ is strong. This can be applied to the case of sentences with 
inversion as they occur in French (for instance: je demande quand viendront 
les beaux jours (I ask when will come the beautiful days)). These sentences are 
analyzed by means of a "subsequence" of modes which compels the 

wh-constituent and the verbal head to move up. Of course, this is an optional 
typing, the other possibility is which gives only je demande quand les 

beaux jours viendront. 

Actually, we can assume in this framework several choices of (sub) sequences of 
modalities. These sequences of modes are predetermined and associated with 
particular languages. They are such that some modalities (like D^^j) are ’’stuck” 
to other ones (for instance Fl^^). These restrictions can be formulated apart and 
they resemble the well known feature co-occurrence restrictions in GPSG-style. 



6 Conclusion 

This paper proposes a solution to the problem of expressing some theses of the 
Minimalist Program in the Multimodal Categorial Framework. Very clearly, we 
can draw a correspondence between the two frames. 



Minimalist Program             | Multimodal Categorial Framework 
-------------------------------+---------------------------------- 
categorial features            | types 
formal features                | modalities 
categorial features checking   | residuation laws for slashes 
formal features checking       | ◇_a □_a A → A 
moves                          | tree-restructuring by postulates 
overt moves                    | {•-∘}-communication 
covert moves                   | {*-∘}-communication 
logical forms                  | λ-terms representing proofs 



But there are obviously some differences that we propose to view as advantages 
of MCF. If the rational goal of MP is to dispense with as many devices as we can, 
we may claim to be in its spirit if we show how to dispense with empty elements 
like traces and with empty nodes. In a very rough analysis of a sentence like Peter 
loves Mary, the generative conception would posit a trace of the subject in the 
first argument position of the VP-shell, resulting in a structure like: 

Peter I t\ loves Mary 

represented by a tree with an empty node. But this is useless because for the 
semantic interpretation, we only need the tree structure: 



{Peter, {loves, Mary)) 








A. Lecomte 



Even if moves change the respective positions of the constituents with regard 
to their semantic interpretation, traces are devices which can be dispensed with 
because the λ-term which is built up by the Curry-Howard isomorphism encodes 
the history of the transformations, as can be shown particularly clearly in the case 
of covert moves. More generally, in the Generative Framework, traces have been 
introduced in order to encode the history of the transformational derivation into 
the surface structure obtained, but in MCF, building up λ-terms already does 
that job, thus making traces useless. 

A point where we depart from Stabler's MG concerns empty types. We call 
empty types those types which are associated with no string. For instance, MGs 
admit so-called 'phonetically empty lexical items' which would correspond in 
MCF to uninhabited types. We think that it is more in the spirit of Categorial 
Grammar to dispense with those types, and getting rid of them also achieves some 
conceptual economy. 

Of course a trivial parsing algorithm based on a blind proof 
search in the search space would be particularly inefficient, but it seems easy to 
foresee heuristics which could guide the search, for instance by establishing early 
a one-to-one correspondence between the modalities in the goal and those in 
the antecedent. Future work will be devoted to a proof-net approach to these 
problems, which could notably improve the search. 
Abstract. In this paper we give a formal description of the parsing 
model that underlies the treatment of Long Distance Dependencies, 
Topic and Focus, Ellipsis and Quantification in, amongst others, the 
papers In this model, a natural language string consists 

of a sequence of ‘instructions packages’ to construct some term in a for- 
mal representation language, the logical form of the string in question. 
Parsing, then, is the process of executing these packages in a left to right 
order. 



Introduction 

In this paper we will give a formal description of the parsing model that under- 
lies the treatment of Long Distance Dependencies, Topic and Focus, Ellipsis and 
Quantification in, amongst others, the papers 121 , 0 , 0 , 131 , 0 . Although the in- 
tuition behind the model is quite natural, nevertheless, it seems not to have been 
explored to any extent. The main idea is to view the natural language string as 
consisting of a sequence of ‘instructions packages’ to construct some term in a 
formal representation language, this term being the supposed interpretation or 
logical form of the string in question. Parsing, then, is the process of executing 
these packages in a left to right order. At all intermediate steps in the parse 
process, one will have a possibly incomplete specification of (some part of) a 
logical form. Incompleteness may arise in various ways: one may have a com- 
pleted term which is a subterm of a logical form yet to be finished, one may have 
some subterm but not know yet how it is to fit within the term currently under 
construction, one may have an incomplete specification for a subterm with an 
accompanying constraint on its completion, and so on. All these possibilities are 
discussed in the papers mentioned above. The purpose of this paper is to formalise 
the concepts of partiality and information growth required. 

In our view, parsing can be seen as establishing an association (s, D) between 
a natural language string s and a set of logical formulas D, the possible 
interpretations of s. In our model, such a set D of logical forms is constructed by 
working left to right through the sequence of instructions s, 

$PARSE(s) = ((s(1), D_1), \ldots, (s(n), D_n)),$ 

M. Moortgat (Ed.): LACL’98, LNAI 2014, pp. 159-EB 2001. 

@ Springer- Verlag Berlin Heidelberg 2001 



160 W. Meyer-Viol 



[0 [0 Fo(John)] [1 [1 Fo(λxλy read(x)(y))] [1 [0 Fo((x, book x))] [1 Fo(λP(some P))]]]] 
Fig. 1. A Term as a labelled tree 



where s(n) is the last element of s, each $D_i$ is a finite set of partial logical forms, 
$D_1$ is the result of processing s(1) in a starting context $D_0$ and, provided s is 
grammatical, $D_n$ is a set of complete logical forms. In our model, the partial 
logical forms in $D_i$ will be represented as the points or states of some partially 
ordered structure where the partial order $\le$ represents development or incremental 
growth of tree structure: for each $D_{i-1}$ in PARSE(s) and partial logical form 
$T \in D_{i-1}$ there is a form $T'$ such that $T \le T'$ and $T' \in D_i$. 
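The left-to-right construction of the sets of partial logical forms can be sketched as a simple fold. The `step` function, which maps one partial form and one instruction package to the set of its possible developments, is a hypothetical stand-in; the toy version below merely records the words processed, so that the prefix relation plays the role of the growth order.

```python
def parse(instructions, step, start):
    """Return [(s(1), D_1), ..., (s(n), D_n)], threading the state set left to right."""
    result = []
    current = frozenset(start)
    for instruction in instructions:
        # Every form in D_{i-1} is developed by the i-th instruction package.
        current = frozenset(t2 for t in current for t2 in step(t, instruction))
        result.append((instruction, current))
    return result


def toy_step(partial_form, word):
    """Toy development: a partial form is the tuple of words processed so far."""
    return {partial_form + (word,)}


def grows(t, t2):
    """Toy growth order: t is a prefix of t2."""
    return t2[:len(t)] == t


trace = parse(["John", "read", "a", "book"], toy_step, {()})
states = [frozenset({()})] + [d for _, d in trace]
growth_ok = all(
    any(grows(t, t2) for t2 in later)
    for earlier, later in zip(states, states[1:])
    for t in earlier
)
```

The final check mirrors the condition stated in the text: every form in one stage has a development in the next stage.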

1 Terms as Decorated Trees 

In order to handle partiality of logical forms in a flexible way, we represent the 
elements of each $D_i$ as decorated finite partial trees. Logical forms are built up 
by one or more ways of putting together basic semantic entities. Each of these 
modes of combination can be associated with a product type-constructor. A term 
in a language appropriate for these type-constructors can be represented as a 
finite binary branching tree, where every binary branching $(b_1, b_2)$ reflects the 
presence of a subterm $O(b_1, b_2)$ for some operator O. As an example, let APL be 
the operation of function application in a typed lambda calculus. The sentence 
John read a book, represented by the formula read(John, some(x, book(x))), 
can be seen as resulting from the unreduced lambda term 

APL(APL(λxλy read(y)(x), APL(λP(some P), (x, book x))), John) 

by β-reduction. In Figure 1 we have represented this term as a decorated binary 
tree in the form of a bracketed formula. Here '[0' means argument- and '[1' 
function-daughter, and the predicate 'Fo' (for formula) holds for (sub)terms 
of our meaning representation language (in contrast to, for instance, the 'Ty' 
predicate that will hold for type expressions). 
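A decorated binary tree of this kind can be evaluated bottom-up by applying each function daughter to its argument daughter. The sketch below follows the unreduced lambda term given in the text; the `Node` class and the tuple encoding of formulas are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Node:
    fo: Any = None                  # Fo decoration (only used on leaves here)
    arg: Optional["Node"] = None    # 0-daughter: argument
    fun: Optional["Node"] = None    # 1-daughter: function


def evaluate(node):
    """Compute the formula of a node: leaves carry it, branchings apply fun to arg."""
    if node.arg is None and node.fun is None:
        return node.fo
    return evaluate(node.fun)(evaluate(node.arg))


# λxλy read(y)(x), λP(some P) and the pair (x, book x), encoded with tuples.
read = lambda x: lambda y: ("read", y, x)
some = lambda p: ("some", p[0], p[1])
book_pair = ("x", ("book", "x"))

tree = Node(
    arg=Node(fo="John"),
    fun=Node(
        fun=Node(fo=read),
        arg=Node(fun=Node(fo=some), arg=Node(fo=book_pair)),
    ),
)
formula = evaluate(tree)
```

Evaluation yields the tuple counterpart of read(John, some(x, book(x))), that is, the β-reduced form of the term.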

Such decorated tree structures have to be constructed in the course of a parse 
through an NL string. In order to deal with the partial logical forms arising 
during a parse we consider T-structures. 

Definition 1 (T-Structures) A T-structure is a quintuple of the form 
$\mathcal{T} = \langle T, \prec_0, \prec_1, \prec_\downarrow, \prec_* \rangle$ where T is a non-empty domain of tree nodes and $\prec_i$, for 
$i \in I = \{0, 1, \downarrow, *\}$, is a (possibly empty) binary relation on T. 

The set BT consists of those T-structures which are ordered as binary trees, 
where $\prec_0$ is the argument-daughter and $\prec_1$ the function-daughter relation, $\prec_\downarrow$ 
is the immediate dominance relation ($\prec_\downarrow = \prec_0 \cup \prec_1$), and $\prec_*$ is the dominance 
relation, i.e., the reflexive and transitive closure of $\prec_\downarrow$. 
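The two relational conditions on members of BT can be checked mechanically. In this sketch relations are sets of node pairs and nodes are named by their addresses; the function names are hypothetical, and the check covers only the two equations of the definition, not the further requirement that the structure has the shape of a binary tree.

```python
def reflexive_transitive_closure(nodes, relation):
    """Smallest reflexive and transitive relation on nodes containing relation."""
    closure = {(n, n) for n in nodes} | set(relation)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def satisfies_bt_equations(nodes, arg, fun, down, star):
    """Check down = arg ∪ fun and star = reflexive transitive closure of down."""
    return down == (arg | fun) and star == reflexive_transitive_closure(nodes, down)


nodes = {"0", "00", "01", "010", "011"}
arg = {("0", "00"), ("01", "010")}     # argument-daughter relation
fun = {("0", "01"), ("01", "011")}     # function-daughter relation
down = arg | fun
star = reflexive_transitive_closure(nodes, down)
```

Here the root "0" dominates "011" through two immediate dominance steps, so the pair belongs to the computed dominance relation.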
[0 [0 Fo(John)] [* [0 Fo((x, book x))], [1 Fo(λP(some P))] ] ...] 

Fig. 2. A Partial Term as a partial labelled tree 



Definition 2 (Partial Trees) A function f is a Tr-morphism from T-structure 
$\mathcal{T}$ to T-structure $\mathcal{T}'$ if it maps T into T' such that for all $n, m \in T$ and 
for all $i \in I$: $n \prec_i m \Rightarrow f(n) \prec_i f(m)$. The set PT, of partial trees, consists of 
all T-structures $\mathcal{T}$ such that there is a Tr-morphism mapping $\mathcal{T}$ to an element 
of BT, i.e., a binary tree. 
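The morphism condition is a plain preservation check, as the following sketch shows. Structures are encoded as dictionaries from relation names to sets of pairs, an encoding assumed here for illustration; the example maps a two-node partial tree with only an under-specified dominance link into a small binary tree.

```python
def is_tr_morphism(f, source, target):
    """True if f preserves every relation: n <_i m in source implies f(n) <_i f(m) in target."""
    return all(
        (f[n], f[m]) in target[name]
        for name, pairs in source.items()
        for (n, m) in pairs
    )


# A binary tree with root "0", argument daughter "00" and function daughter "01".
binary_tree = {
    "0": {("0", "00")},
    "1": {("0", "01")},
    "down": {("0", "00"), ("0", "01")},
    "star": {("0", "0"), ("00", "00"), ("01", "01"), ("0", "00"), ("0", "01")},
}

# A partial tree: r dominates u, but no daughter relation is fixed yet.
partial_tree = {
    "0": set(),
    "1": set(),
    "down": set(),
    "star": {("r", "r"), ("u", "u"), ("r", "u")},
}

good_map = {"r": "0", "u": "01"}
bad_map = {"r": "00", "u": "0"}
```

The first mapping witnesses that the partial structure belongs to PT; the second fails because it would reverse the dominance link.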

Notice that in a full-blown binary tree, $n \prec_* m$ implies that there is a sequence 
of immediate dominance steps relating n to m, but in a partial tree this does 
not need to be the case. The under-specified tree relations, and in particular 
$\prec_*$, will play an essential role in constructing the logical form while traversing 
the string in a left to right fashion: we cannot always decide on the spot where 
a certain subterm has to function in the eventual term. Given the string A book 
John read, we do not yet know what to do with A book at the start of the sentence. 
After having parsed A book John, a possible partial logical form constructed is 
shown in Figure 2, which gives a partial tree model with an under-specified tree 
relation. This under-specified relation constrains the completions to those binary 
trees which have this relation witnessed by an immediate dominance sequence. 

In a set-up where partial trees are constructed in stages, we need a pointer to 
identify the nodes at which action is to take place, that is, our basic units have 
to be representations of the form $(\mathcal{T}, n)$, a partial tree $\mathcal{T}$ together with a pointer 
indicating some node $n \in T$, which we will write as $\mathcal{T}n$. 

Definition 3 (Structure of Pointed Partial Trees) Let 
$PPT = \{\mathcal{T}n \mid \mathcal{T} \in PT,\ n \in T\}$ be the set of pointed partial trees. For $i \in I$ we set $\mathcal{T}n \prec_i \mathcal{T}'n'$ 
if $\mathcal{T} = \mathcal{T}'$ and $n \prec_i n'$, and $\mathcal{T}n \le \mathcal{T}'n'$ if there is a Tr-morphism $f : \mathcal{T} \to \mathcal{T}'$ 
such that $f(n) = n'$. The frame of Pointed Partial Tree structures can now be 
defined as 
defined as 

PPP = 

Along <, a pair n,m G T such that n m may be mapped by Tr-morphism / 
to a pair such that f{n) /(w) and later, by some Tr-morphism g, to a fully 
specified relation g{f(n)) ff(/(w)). 

In the general model we also use a relation $\prec_L \subseteq PPT \times PPT$ where $\mathcal{T}n \prec_L \mathcal{T}'n'$ 
implies that $T \cap T' = \emptyset$ and n' is a top node of $\mathcal{T}'$. Thus, this relation connects two 
disjoint trees where an arbitrary node of the first tree is Linked to the top node 
of the second tree. These disjoint, but connected, trees will represent linguistic 
"islands" in our model. This relation is used in the example of Figure 0 
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1.1 The Language DU 

On these pointed tree structures we can interpret the Language of Finite Trees,
LFT (see [1]), a propositional modal language with the modalities $\langle 0 \rangle\phi$
("$\phi$ holds on the argument daughter"), $\langle 1 \rangle\phi$ ("$\phi$ holds on the
function daughter"), $\langle \downarrow \rangle\phi$ ("$\phi$ holds on some daughter"),
$\langle * \rangle\phi$ ("$\phi$ holds here or somewhere below"), $\langle L \rangle\phi$
("$\phi$ holds on a linked node"), their converses $\langle i^{-1} \rangle$, and their
universal variants $[i]$, $[i^{-1}]$, for $i \in \{0, 1, \downarrow, *\}$. In the tree of
Figure^ for instance the following holds:

- $\langle 0 \rangle Fo(\text{John})$ and $\langle 1 \rangle\langle 1 \rangle Fo(\lambda x \lambda y\, \text{read}(x)(y))$ at the top node 0.

- $\langle \downarrow^{-1} \rangle\langle * \rangle Fo(\epsilon, x, \text{book}\,x)$ at node 00 decorated by $Fo(\text{John})$.

where the atomic formulas $Fo(\text{John})$ etc. are designed to describe Declarative
Units decorating binary (linked) tree structures. Declarative units are pairs
consisting of a sequence of labels followed by a content formula:

$\underbrace{\langle l_1, \dots, l_n \rangle}_{\text{labels}} : \underbrace{\phi}_{\text{Formula}}$



We have seen examples of content formulas in the denotations John, $(\epsilon, x, \text{book}\,x)$
and $\lambda x \lambda y\, \text{read}(x)(y)$. The types $e$, $t$ and $e \to t$ from these examples are
instances of labels. The descriptions of declarative units determine the atomic
vocabulary. So, our language has monadic predicates $La_1, \dots, La_n, Fo$, standing
for $n$ label dimensions and a formula dimension, and individual constants
from $D_{La_1}, \dots, D_{La_n}, D_{Fo}$ respectively, denoting values on these dimensions. The
atomic propositions of the language then have the form $La_i(t)$ or $Fo(t)$ where
$t$ is either an element of the appropriate domain $D_{La_i}$, $D_{Fo}$, or a meta
variable. A declarative unit $\langle l_1, \dots, l_n \rangle : \phi$ can then be completely represented by
a description, the finite set of atomic propositions satisfied by that unit:

$\{La_1(l_1), \dots, La_n(l_n), Fo(\phi)\}$



And a partial declarative unit, an object naturally arising in the course of a 
parse, is merely a subset of a description of a declarative unit. 

The language, DU, we have settled on to describe partial declarative units and 
their developments towards logical forms includes the tree modalities from LFT, 
the standard Boolean constants and connectives and existential and universal 
quantifiers ranging over the set of label and formula values. 

Definition 4 (The Representation Language DU) A proposition $A$ of the
language DU has one of the following shapes:

$A ::= \top \mid \bot \mid La_1(l_1) \mid \dots \mid La_n(l_n) \mid Fo(\phi) \mid Eq(t_1, t_2) \mid A \wedge A \mid A \vee A \mid A \to A \mid \exists \mathbf{x} A \mid \forall \mathbf{x} A \mid \langle \# \rangle A \mid [\#] A$

for $VAR = \{\mathbf{x}, \mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{y}, \dots\}$ a denumerable set of individual variables, $MV$
a denumerable set of meta variables, predicate values $l_i \in D_{La_i} \cup VAR \cup MV$
for each $i$: $1 \leq i \leq n$, $\phi$ a logical form in $D_{Fo} \cup VAR \cup MV$,
$t_1, t_2 \in D_{La_i} \cup D_{Fo} \cup VAR \cup MV$, and $\#$ a modality $i$ or $i^{-1}$ for
$i \in \{0, 1, \downarrow, *\}$. The quantifier variables are rendered in boldface to
distinguish them from the variables bound by quantifiers in the domain $D_{Fo}$,
i.e., variables occurring in the logical forms under construction.



As is standard, this language is interpreted over Pointed Partial Trees by means 
of Valuation functions V. 



Definition 5 (Pointed Partial Tree Models) A Pointed Partial Tree Model
$M$ is a pair $M = \langle \mathcal{PPT}, V \rangle$ consisting of a Pointed Partial Tree Structure $\mathcal{PPT}$
and a valuation $V$ assigning finite sets of atomic formulas to elements $Tn \in PPT$
and satisfying the following principle:

$Tn \leq T'n' \Rightarrow V(Tn) \subseteq V(T'n')$.

This principle guarantees that once an atomic proposition has been established at
some node in a partial tree, this proposition will continue to hold there throughout
all future developments of that tree. These pointed partial trees will now be used
to represent (unreduced) lambda terms as in Figure ^ 

Definition 6 (Truth Definition for DU) Given a Pointed Partial Tree
Model $M = \langle \mathcal{PPT}, V \rangle$, a set $D = D_{La_1} \cup \dots \cup D_{La_n} \cup D_{Fo}$ of label and formula
values, a set $MV$ of meta variables, and $t_1, t_2 \in D \cup MV$, we say that pointed tree
$Tn \in PPT$ of $M$ satisfies formula $\phi$, with the notation

$Tn \models_M \phi$,

if

$\phi$ is atomic and $\phi \in V(Tn)$
$\phi = \top$
$\phi = Eq(t_1, t_2)$ and $t_1 = t_2$
$\phi = \psi \wedge \chi$ and $Tn \models_M \psi$ and $Tn \models_M \chi$
$\phi = \psi \vee \chi$ and $Tn \models_M \psi$ or $Tn \models_M \chi$
$\phi = \psi \to \chi$ and for all $T'n'$ with $Tn \leq T'n'$: if $T'n' \models_M \psi$ then $T'n' \models_M \chi$
$\phi = \exists \mathbf{x} \psi$ and there is a $t \in D$: $Tn \models_M \psi[t/\mathbf{x}]$
$\phi = \forall \mathbf{x} \psi$ and for all $T'n'$ with $Tn \leq T'n'$ and all $t \in D$: $T'n' \models_M \psi[t/\mathbf{x}]$

and, for $i \in \{0, 1, \downarrow, *\}$, if

$\phi = \langle i \rangle \psi$ and $\exists T'n' \in PPT$: $Tn \prec_i T'n'$ and $T'n' \models_M \psi$
$\phi = \langle i^{-1} \rangle \psi$ and $\exists T'n' \in PPT$: $T'n' \prec_i Tn$ and $T'n' \models_M \psi$
$\phi = [i] \psi$ and for all $T'n'$ with $Tn \leq T'n'$ and all $T''n'' \in PPT$: if $T'n' \prec_i T''n''$ then $T''n'' \models_M \psi$
$\phi = [i^{-1}] \psi$ and for all $T'n'$ with $Tn \leq T'n'$ and all $T''n'' \in PPT$: if $T''n'' \prec_i T'n'$ then $T''n'' \models_M \psi$

So no decorated node satisfies $\bot$, $\top$ holds on every node, and the atomic formulas
hold according to the valuation function. The Boolean connectives and the
modal operators have their standard interpretation. As usual, we can introduce
negation by the following definition.

$\neg\phi =_{df} \phi \to \bot$.

So, $\neg\phi$ holds at a pointed tree $Tn$ if $\phi$ does not hold there nor at any $T'n'$
such that $Tn \leq T'n'$. The operators and modalities with universal force
('$\to$', '$\forall$', '$[\#]$') quantify not only over (nodes of the) current (partial) decorated
trees, but over possible developments of the current structure. For instance,
the top node of the current decorated partial tree need not be the root node of
the eventual tree. Given that we have a falsum $\bot$ (satisfied by no node) and
verum $\top$ (satisfied by all nodes) we can 'close off' the top node by expressing
facts like $Tn \models_M [\downarrow^{-1}]\bot$, meaning "at all mother nodes of $Tn$ the formula $\bot$
holds", i.e., $Tn$ has no mother node (and will have none). This is an operation
that can take place on a tree the moment all words of the NL string have been
processed: the node that happens to be the top one at that moment is turned
into a root node. At the other end of the tree we can declare bottom nodes to
be terminal nodes by annotating them with $[\downarrow]\bot$. This is a task of the lexical
entries associated with the words: a word closes off a branch downwards.

By definition we have persistence of atomic DU-formulas. By the form of the
Truth definition, this can be lifted to the whole of DU. So, if $\phi$ is a DU-formula,
$Tn \models_M \phi$ and $Tn \leq T'n'$, then $T'n' \models_M \phi$.

It may be illuminating to view some typical interactions between the tree
modalities and connectives, like implication, with universal force. In a model
$M$ we can have $Tn \models_M \langle * \rangle\phi$ without there being a sequence $Tn \prec_\downarrow \dots \prec_\downarrow Tn'$
such that $Tn' \models_M \phi$. In a partial tree the relation $\prec_*$ between two nodes entails
only that the path between them can always be completed to a fully specified
one: by definition, a partial tree can always be Tr-morphically embedded in a
binary tree (where $\prec_\downarrow$ is the real immediate dominance relation and $\prec_*$ its real
reflexive and transitive closure). On Pointed Partial Tree Models the basic logic
of finite trees holds 'under double negation'. By definition, every Pointed Partial
Tree can be extended in $M$ to a full-blown binary tree. Such a tree satisfies all
the principles of the Logic of Finite Trees as listed in [1]. Consequently, if $\phi$ is
an LFT theorem, i.e., $\models_{LFT} \phi$, then $Tn \models_M \neg\neg\phi$ for every model $M$.

The meta variables can be distinguished from the proper values by the fact
that only for a proper value $l_i$ do we have the satisfaction of $\exists \mathbf{x} La_i(\mathbf{x})$: $\exists \mathbf{x}\psi$ holds
at a node if $\psi[t/\mathbf{x}]$ holds there for some $t \in D$, and a meta variable $\mathbf{U}$ is not an
element of $D$. We want the existential quantifier to be able to express that some
label or feature predicate has a value, and a meta variable, or the 'undefined'
symbol, just won't do as a proper value. In the next paragraph we will see some
uses of this fact.

Apart from the LFT principles we will have to introduce axioms regulating the
behaviour of the Fo and Ty predicates on the trees. For instance, a node may
be annotated by at most one type. This requires a principle of the form

$\forall \mathbf{x} \forall \mathbf{y} ((Ty(\mathbf{x}) \wedge Ty(\mathbf{y})) \to Eq(\mathbf{x}, \mathbf{y}))$.

Furthermore, the Fo and Ty values at the daughters of some node have to
be related to the values on those predicates at the node itself; for the type
predicate, for instance, this is forced by the principle

$\forall \mathbf{x} \forall \mathbf{y} ((\langle 0 \rangle Ty(\mathbf{x}) \wedge \langle 1 \rangle Ty(\mathbf{x} \to \mathbf{y})) \to Ty(\mathbf{y}))$.
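
The two type principles can be illustrated with a minimal sketch. The names `Arrow`, `mother_type` and `consistent` are hypothetical helpers introduced only for this illustration; they are not part of the formal system, and types are simply modelled as strings and nested pairs.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Arrow:
    """Functional type 'arg -> res' (illustrative encoding)."""
    arg: "Type"
    res: "Type"


# Basic types are the strings "e" and "t"; complex types are Arrow values.
Type = Union[str, Arrow]


def consistent(types_at_node: set) -> bool:
    """First principle: a node is annotated by at most one type."""
    return len(types_at_node) <= 1


def mother_type(arg_daughter: Type, fun_daughter: Type) -> Optional[Type]:
    """Second principle: <0>Ty(x) and <1>Ty(x -> y) yield Ty(y) at the mother.

    Returns None when the daughters' types do not fit together.
    """
    if isinstance(fun_daughter, Arrow) and fun_daughter.arg == arg_daughter:
        return fun_daughter.res
    return None


read_type = Arrow("e", Arrow("e", "t"))      # type of 'read'
vp_type = mother_type("e", read_type)        # read + object
sentence_type = mother_type("e", vp_type)    # subject + verb phrase
```

Here the type of the mother is computed bottom-up from the argument and function daughters, which is one way to read the second principle as a procedure.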



2 Goal-Directedness 

We still have to add one refinement to the pointed partial trees. Within the 
domain PPT of a partial tree model M = {WT, V), we identify the set LoFo 
consisting of the decorated partial trees that correspond to (unreduced) terms 
of the typed lambda calculus. Grammatical strings, the words of which project 
actions mapping one pointed partial tree to a next one, must lead us into this 
subset of PPT. In all elements of LoFo, the root node, for instance, will be 
annotated by a lambda term of type t. Thus, in any partial stage, that root 
node will have a requirement that it be annotated by a lambda-term of type t, 
another node that it be annotated by a term of type e, and so on. This use of
requirements on the development of a tree node has some resemblance to the
familiar concept of 'sub-categorization', as a node decorated by the labelled formula
$Ty(e \to (e \to t))$, $Fo(\text{read})$ within a tree may have a mother node which
is decorated with a requirement $\langle \downarrow \rangle Ty(e)$ (that is, a requirement for an internal
argument for 'read'). However, the concept of requirement is much more general
than sub-categorization statements: all nodes are introduced with requirements. 
Requirements form an essential feature of a partial tree as they determine a set 
of ‘successful’ extensions, namely those in which all requirements are satisfied. 
Consequently, our basic data structures are tree structures, the nodes of which 
consist of (partial) declarative units paired with finite sets of requirements. 

To model these requirements we add a requirement function $R$ to the model,
assigning a finite number of (arbitrary) DU-formulas to elements $Tn$. These
formulas represent a finite number of requirements on that node (as opposed to the
facts at that node assigned by $V$). Figure 3 represents a decorated partial tree
resulting from having parsed John read. This tree includes a node with a
requirement for an object of type e, rendered by $?Ty(e)$ (this node-plus-requirement is
introduced, "sub-categorized for", by the verb read).

Definition 7 (Models with Requirements) A Pointed Partial Tree Model
with requirements is a pair $\mathcal{M} = \langle M, R \rangle$, where $M = \langle \mathcal{PPT}, V \rangle$ is a Pointed
Partial Tree Model and $R$ is a function assigning finite sets of DU-formulas to
elements of $PPT$ and satisfying the following constraint:

$Tn \leq T'n' \Rightarrow R(Tn) \subseteq (Th(T'n') \cup R(T'n'))$.

Here the theory $Th(Tn)$ of a node $Tn$ is given by $Th(Tn) = \{\phi \in DU \mid Tn \models_M \phi\}$.
Notice that, unlike the valuation function, the requirement function is not
restricted to atomic propositions; we are free to require any (finite number of)
DU-formula(s) at some tree node.

Along the growth relation $\leq$ requirements may disappear, but only by becoming
facts. For instance, we have

$[_a\ ?Ty(e)] \leq [_a\ Fo(\phi), Ty(e), ?Ty(e)] \leq [_a\ Fo(\phi), Ty(e)]$.

But also,

$[_a\ ?\exists \mathbf{x} Fo(\mathbf{x}), ?Ty(e)] \leq [_a\ Fo(\text{John}), Ty(e)]$.

The successful developments of a pointed partial tree $Tn$, that is, the developments
in which all requirements are satisfied, we collect in the set $LoFo(Tn)$ of
(supposed) Logical Forms into which $Tn$ can develop.

$LoFo(Tn) = \{T'n' \in PPT \mid Tn \leq T'n' : R(T'n') = \emptyset\}$.

In fact, the set LoFo represents an ‘internal’ version of the set of logical forms 
we are after. The object is now to make LoFo and the set of real logical forms 
coincide. Below we will discuss this more extensively. 
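
The growth constraint on requirements can be sketched for a single node as follows. This is a deliberately simplified illustration: the class `Node` and the functions `develops_to` and `is_complete` are hypothetical names, formulas are plain strings, and the theory of a node is approximated by its set of atomic facts, so a requirement such as $?\exists \mathbf{x} Fo(\mathbf{x})$ is not handled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    facts: frozenset = frozenset()          # V(Tn): established atomic formulas
    requirements: frozenset = frozenset()   # R(Tn): outstanding requirements


def develops_to(old: Node, new: Node) -> bool:
    """Check the two growth constraints for one node.

    Facts persist, and every old requirement is either still required
    or has become a fact (Th(T'n') is approximated by new.facts).
    """
    facts_persist = old.facts <= new.facts
    reqs_accounted = old.requirements <= (new.facts | new.requirements)
    return facts_persist and reqs_accounted


def is_complete(node: Node) -> bool:
    """A node can belong to a logical form only if no requirement is left."""
    return not node.requirements


n0 = Node(requirements=frozenset({"Ty(e)"}))
n1 = Node(facts=frozenset({"Fo(John)", "Ty(e)"}),
          requirements=frozenset({"Ty(e)"}))
n2 = Node(facts=frozenset({"Fo(John)", "Ty(e)"}))
```

The chain `n0`, `n1`, `n2` mirrors the first example above: the requirement for type e disappears only once `Ty(e)` has become a fact.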

Having introduced the concept of a requirement over Pointed Partial Tree Models 
we will exploit it by introducing some constants to the language DU which 
address the status of the requirements. 

Definition 8 (Truth Definition for Requirements) Given a Pointed Partial
Tree Model with Requirements, $\mathcal{M} = \langle M, R \rangle$, we say that pointed tree $Tn$
of $\mathcal{M}$ satisfies formula $\phi$, with the notation

$Tn \models_{\mathcal{M}} \phi$,

if

$\phi \in DU$ and $Tn \models_M \phi$
$\phi = ?\psi$ and $\psi \in R(Tn)$
$\phi = ?\emptyset$ and $R(Tn) = \emptyset$
$\phi = F\psi$ and $\exists T'n' \in LoFo(Tn)$: $T'n' \models_M \psi$
$\phi = G\psi$ and $\forall T'n' \in LoFo(Tn)$: $T'n' \models_M \psi$

A requirement $?\phi$ holds if the formula $\phi$ occurs on the requirement list of that
node, and $?\emptyset$ is a constant which holds at a node if it has an empty requirement
list. Proposition $F\psi$ holds at node $Tn$ if there is at least one development of
$Tn$ to (a node in) a logical form where $\psi$ holds, and $G\psi$ holds at node $Tn$ if $\psi$
holds at all developments of $Tn$ to logical forms.

A tree node $Tn$ has requirements that can be satisfied iff $F\top$ holds at $Tn$. This
allows us to define a conditional: $\phi \to_{gr} \psi =_{df} (F\top \wedge \phi) \to \psi$. This conditional
expresses invariances shared only by successful developments. The principles of
binary trees we express using $\to$ (for instance, $\langle 1 \rangle\top \to \langle 0 \rangle\top$, "if there is a
function daughter then there is an argument daughter"; after all, all PPT's can
map to binary trees, by definition). But, for instance, $\beta$-reduction need only
function in trees with successors in LoFo. Here we use the axiom schema

$(\langle 0 \rangle Fo(\lambda x\phi) \wedge \langle 1 \rangle Fo(\psi)) \to_{gr} Fo(\phi[\psi/x])$.

$[_0\ [_0\ Fo(\text{John})],\ [_1\ [_0\ ?Ty(e)],\ [_1\ Fo(\lambda x \lambda y\, \text{read}(x)(y))]]]$

Fig. 3. Partial tree with requirements



2.1 The Uses of Requirements 

Consider all terms of the typed lambda calculus we use to formulate our
logical forms, the representations of the formal meanings of the NL strings
under consideration. The elements of this set are the target structures the NL
strings have to construct. We represent these terms as trees by their applicative
structure (that is, a subterm $(\lambda x\phi)(\psi)$ is represented as a binary branching point
with $\lambda x\phi$ annotating the function daughter and $\psi$ the argument daughter). At the nodes
of these representations we hang empty sets. These empty sets represent empty
sets of requirements. This we now turn into the set LoFo of some Pointed
Partial Tree Model $\mathcal{M}$, by adding all partial versions that can be extended to
elements of LoFo. Now, in order to guarantee that the set LoFo, defined in
terms of fulfilled requirements, and the set of representations of terms in our
typed lambda calculus coincide, every abstraction of such a term to a partial
object has to be compensated by the introduction of a requirement. Starting,
for instance, from the term $\text{read}(\text{John}, (\epsilon, x, \text{book}\,x))$ we can create a partial
term by abstraction over John. This abstraction is not a term of our typed
lambda calculus, so it should not belong to LoFo. By definition, this means
that there must be some unfulfilled requirement associated with it. But this
does not have to be a requirement for exactly 'John'. It may (and, in fact,
will) be merely a requirement for type e. So here is where the invariances of the
process come in.

Given a specific feature of the logical forms we are interested in, the first
question is now always: can we devise a system of requirement introduction
such that the fulfillment of all requirements annotating a given tree corresponds
exactly to completing this tree to a term of our typed lambda calculus
with the desired feature? We will give three examples of our use of this principle.



Binary trees: as a first example, we will consider a requirement strategy to the
effect that the set LoFo coincides with the set of binary tree structures. That
is, no under-specified tree relations of the form $\prec_*$ or $\prec_\downarrow$ are left in LoFo that
are not the reflexive transitive closure of the immediate dominance relation or
the union of argument and function daughter relation, respectively. The idea is
to introduce a Tree node label, a monadic predicate 'Tn' with values in
$D_{Tn} = \{a \cdot x, x \mid a \in A, x \in \{0, 1, L\}^*\}$. That is, a value of this predicate is a finite
sequence of elements of $\{0, 1, L\}$ possibly preceded by a constant from a set $A$.
When trees are constructed in the parsing process, in general it is not known
whether a description that starts off as a top node will remain so (and thus
be the root node of the eventual tree). This is why we introduce a new node
invariably with 'address' $a \in A$, satisfying a formula $Tn(a)$, where $a$ is a constant
not yet occurring in the construction. In interaction with the tree modalities
various constellations are expressible. So, given the formula $Tn(a)$, expressing
the location of a node in the tree under construction, we can fix

$Tn(a0) \leftrightarrow \langle 0^{-1} \rangle Tn(a)$,

$Tn(a1) \leftrightarrow \langle 1^{-1} \rangle Tn(a)$,^

and we can fix the root node of a tree as follows,

$Tn(0) \leftrightarrow [\downarrow^{-1}]\bot$.

(Significantly, we do not 'internalize' the under-specified modalities $\langle * \rangle$ and $\langle \downarrow \rangle$
as values of the Tn predicate.) The Tree node formula $Tn(0)$ holds at a node if
it is a top node and remains so throughout all developments. (Note the use of
the 'falsum': "At every node above the current one $\bot$ holds." As $\bot$ is satisfied
by no node at all (Definition 6), this means that there are no nodes above the
current one.)

Now when we introduce a tree node with an under-specified relation to the source
node, as we do in Figure 5, we add the requirement for $\exists \mathbf{x} Tn(\mathbf{x})$ to the node
with under-specified address:

$[_a\ ?Ty(t)],\ a \;\Rightarrow\; [_a\ ?Ty(t),\ [_*\ ?Ty(e), ?\exists \mathbf{x} Tn(\mathbf{x})]],\ a*$

The point is that this requirement can only be satisfied when the node it
decorates is merged with one that has a fully specified relation to the top node of
the tree: only in that case will there be a $t \in D_{Tn}$ such that $Tn(t)$ holds at
the node. So, if, starting from the Axiom, we introduce tree nodes with
under-specified tree relations only if they are accompanied by requirements for values on the Tn
predicate, then in all elements of LoFo that we can reach, all underspecification
with respect to tree relations will have been resolved.

Pronouns: an analogous mechanism allows us to introduce meta variables or
placeholder values with the requirement that they be substituted in order for the
term to be complete.

$[_a\ ?Ty(e)],\ a \;\Rightarrow\; [_a\ Fo(\mathbf{U}), Ty(e), ?\exists \mathbf{x} Fo(\mathbf{x})],\ a$

An object of type e has been supplied, but a new, weaker, requirement has taken
its place. Only a development that supplies a concrete value for the meta variable
$\mathbf{U}$ in $Fo(\mathbf{U})$ can end up in LoFo.
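
The interplay between a meta variable and the requirement $?\exists \mathbf{x} Fo(\mathbf{x})$ can be sketched as below. The dictionary layout, the string `"Ex.Fo(x)"` standing for the existential requirement, and the helper names are assumptions made for this illustration only.

```python
# Assumption of this sketch: meta variables are the names U, V, W.
META_VARIABLES = {"U", "V", "W"}


def is_metavariable(value: str) -> bool:
    return value in META_VARIABLES


def outstanding(node: dict) -> set:
    """Return the requirements of the node that its decorations do not meet.

    'Ex.Fo(x)' is met only by a proper Fo value, never by a meta variable;
    any other requirement is met when it occurs among the labels.
    """
    left = set()
    for req in node["reqs"]:
        if req == "Ex.Fo(x)":
            met = any(not is_metavariable(v) for v in node["fo"])
        else:
            met = req in node["labels"]
        if not met:
            left.add(req)
    return left


def substitute(node: dict, value: str, metavar: str) -> dict:
    """Replace the meta variable by a concrete formula value (a new node)."""
    new_fo = {value if v == metavar else v for v in node["fo"]}
    return {**node, "fo": new_fo}


pronoun = {"fo": {"U"}, "labels": {"Ty(e)"}, "reqs": {"Ex.Fo(x)"}}
resolved = substitute(pronoun, "John", "U")
```

Before substitution the node still carries an unmet requirement, so it cannot be part of a logical form; after substitution nothing is outstanding.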



^ We will also need $Tn(aL) \leftrightarrow \langle L^{-1} \rangle Tn(a)$ when we consider Linked Trees.

Resumptive pronouns: On the other hand, in the example of Figure 6 we will
introduce the tree node

$[_*\ ?Fo(\epsilon, x, \text{book}\,x)]$

which contains a requirement that can only be fulfilled by a concrete value for
the Fo predicate. Such a value, a copy of a value already introduced, can, in our
model, not derive from the sentence string except by means of a pronoun (as a
'copy instruction'). In parsing the string "A book, John read it", the pronominal
it can copy any accessible value: this is the contribution of the pronoun. But only
if it copies the value from the head 'book' will the requirement for the exact
formula value that has been introduced be fulfilled: this is the contribution of
the goal-directed framework. A pronoun in such a situation is a resumptive one
in our model.

3 The Parsing Process 

The object of the parsing process is to construct a binary tree structure the top
node of which is decorated by a formula of type t while using all information in
an NL string. The minimal element in the model $\mathcal{M}$, the starting point of every
parse^, is the Axiom

Axiom: $[_a\ ?Ty(t)],\ a$.

The Axiom is the Pointed Partial Tree Model consisting of a single node (thus
the location of the pointer), the putative top node, with empty valuation
function and a requirement for an object of type t.

In the course of a parse starting from the Axiom a tree is created. A parse ends
essentially after the last word of an NL string has been processed. A successful
parse of a grammatical NL string must result in a logical form, that is, a Pointed
Binary Tree in the set LoFo of $\mathcal{M}$. So, it must end with a tree where all nodes
are associated with an empty list of requirements. Moreover, if a node $Tn$ is a
terminal node in $T$, then $Tn \models_{\mathcal{M}} [\downarrow]\bot$. Thus we have a tree that can no longer
be extended with descriptions of new nodes. So, a Goal has the form

Goal: $[_a\ \dots, Ty(t),\ [_{a0}\ \dots],\ [_{a1}\ \dots] \dots],\ a \in LoFo$.

This is a tree all nodes of which have empty requirement lists. If the procedure 
that leads from the Axiom to an element of LoFo is sound, then the final tree 
represents — is isomorphic to — an unreduced term in the language of our 
logical forms. When an NL string is finished, the words have performed their 
task, then, if it is grammatical, a binary tree structure has been constructed the 
terminal nodes of which are decorated and have no unsatisfied requirements. 
Now, this (representation of an) unreduced lambda term may be normalized in 
ways depending on a variety of labels collected on the tree during the parse. 



^ of an NL string in isolation, i.e., without a context.

By parsing a grammatical NL string the Axiom is connected to a Goal by a
finite sequence of Pointed Partial Trees. Each tree $T_{i+1}$ in this sequence is a
development of the previous tree $T_i$, i.e., $T_i \leq T_{i+1}$. Figure 4 shows the
structures arising in the course of parsing "John read a book". Notice that each new
structure is a development under the $\leq$ relation.

$[_a\ ?Ty(t)],\ a$   Axiom
$\leq [_a\ [_0\ ?Ty(e)],\ [_1\ ?Ty(e \to t)]],\ a0$   $\downarrow$
$\leq [_a\ [_0\ Fo(\text{John}), Ty(e)],\ [_1\ ]],\ a1$   John
$\leq [_a\ [_0\ ],\ [_1\ [_1\ Fo(\lambda x \lambda y\, \text{read}(x)(y)), Ty(e \to (e \to t))],\ [_0\ ?Ty(e)]]],\ a10$   read
$\leq [_a\ [_0\ ],\ [_1\ [_1\ ],\ [_0\ Fo(\epsilon, x, \text{book}\,x), Ty(e)]]],\ a10$   a book
$\leq [_a\ [_0\ ],\ [_1\ Fo(\lambda y\, \text{read}(\epsilon, x, \text{book}\,x)(y)), Ty(e \to t),\ [_1\ ],\ [_0\ ]]],\ a1$   $\uparrow$
$\leq [_a\ \text{read}(\epsilon, x, \text{book}\,x)(\text{John}), Ty(t),\ [_0\ ],\ [_1\ [_1\ ][_0\ ]]],\ a \in LoFo$

Fig. 4. Pointed Partial Tree Models arising in the course of parsing "John read a
book." At every transition only the new aspects are highlighted.

The progress from Axiom to
Goal is non-deterministic: at every state of the parse the word currently under
consideration can generally be assigned more than one structural role in the tree. 
That is, the path from Axiom to Goal is one through a space of alternative parse 
courses. For instance, an NP heading an NL string may end up as subject, John 
read a book, it may be a fronted object, A book John read, or it may be a topic 
constituent, (As for) a book John read it. The last two possibilities are worked
out in Figures 5 and 6 respectively.

Notice that the parses of "John read a book" and "A book John read" end with the
same logical form, but this form is reached starting from the Axiom along dif-
ferent routes. Thus the Axiom must be able to access a (finite!) number of al- 
ternative partial tree descriptions to accommodate the first NP of the sentence. 



$[_a\ ?Ty(t)],\ a$   Axiom
$\leq [_a\ ?Ty(t),\ [_*\ ?Ty(e)]],\ a*$   $\downarrow$
$\leq [_a\ ?Ty(t),\ [_*\ Fo(\epsilon, x, \text{book}\,x), Ty(e)]],\ a$   a book
$\leq [_a\ [_0\ ?Ty(e)],\ [_1\ ],\ [_*\ ]],\ a0$   $\downarrow$
$\leq [_a\ [_0\ Fo(\text{John}), Ty(e)],\ [_1\ ?Ty(e \to t)],\ [_*\ ]],\ a1$   John
$\leq [_a\ [_0\ ],\ [_1\ [_1\ Fo(\lambda x \lambda y\, \text{read}(x)(y)), Ty(e \to (e \to t))],\ [_0\ ?Ty(e)]],\ [_*\ ]],\ a10$   read
$\leq [_a\ [_0\ ],\ [_1\ [_1\ ],\ [_0\ Fo(\epsilon, x, \text{book}\,x), Ty(e)]]],\ a10$   $* := 10$
$\leq [_a\ [_0\ ],\ [_1\ Fo(\lambda y\, \text{read}(\epsilon, x, \text{book}\,x)(y)), Ty(e \to t),\ [_1\ ],\ [_0\ ]]],\ a1$   $\uparrow$
$\leq [_a\ \text{read}(\epsilon, x, \text{book}\,x)(\text{John}), Ty(t),\ [_0\ ],\ [_1\ [_1\ ][_0\ ]]],\ a \in LoFo$

Fig. 5. Pointed Partial Tree Models arising in the course of parsing "A book John
read." At every transition only the new aspects are highlighted.

The presence of alternative, and mutually exclusive, continuations is of course
not restricted to the start of an NL string. 

The move from Axiom to Goal in a parse of an NL string s is driven by actions 
which are either projected by the words supplied by s or by the computational 
‘background’ mechanism. 

3.1 Actions 

An action $\alpha$ is of the form

$\alpha \subseteq PPT \times PPT$,

that is, actions are relations over PPT. An action $\alpha$ may map a Pointed Partial
Tree to a number of other such trees. The actions we will introduce are fitted
to the Pointed Partial Tree Models in that they are incremental in the following
sense.

Definition 9 (Incremental Actions) An action $\alpha$ is incremental if for every
$Tn \in PPT$, if $T'n' = \alpha(Tn)$, then either $Tn \leq T'n'$ or $T = T'$.

So an incremental action is either some construction or a pointer movement. The
actions are based on a set of basic or atomic actions. The basic actions consist of
creation of new nodes relative to old ones, decoration of nodes, moving to nodes,
and substitution at nodes. These actions may be combined to give complex ones
by, essentially, the PDL operations.

Definition 10 (Actions) For $\mathcal{M}$ a Pointed Partial Tree Model with
Requirements, where $\#$ is $i$ or $i^{-1}$ for $i \in \{0, 1, \downarrow, *, L\}$, $\phi$ is either an atomic DU
formula or of the form $?\psi$ for arbitrary $\psi \in DU$, and $\mathbf{U} \in MV$, the set ACT of
actions contains the following elements:

Basic Actions

1. $1 = \{\langle Tn, Tn \rangle \mid Tn \in \mathcal{M}\}$,
   $AB = \emptyset$.
   These represent the halting action and the Abort action respectively.

2. $\text{make}(\#) : \mathcal{M} \mapsto \mathcal{M}$. This action creates a node $T'n'$ such that $T' = T \cup \{n'\}$
   and that $T'n \prec_\# T'n'$.

3. $\text{go}(\#) : \mathcal{M} \mapsto \mathcal{M}$. Here $\text{go}(\#)(Tn) = T'n'$ implies $Tn \prec_\# T'n'$.

4. $\text{put}(\phi) : \mathcal{M} \mapsto \mathcal{M}$. Here $\text{put}(\phi)(Tn) = T'n'$ implies that $T = T'$ except
   that $V(T'n') = V(Tn) \cup \{\phi\}$ if $\phi$ is atomic and $R(T'n') = R(Tn) \cup \{\psi\}$ if
   $\phi = ?\psi$.

5. $\text{subst}(\phi, \mathbf{U}) : \mathcal{M} \mapsto \mathcal{M}$. Here $\text{subst}(\phi, \mathbf{U})(Tn) = T'n'$ implies that $T = T'$
   except that $V(T'n') = V(Tn)[\phi/\mathbf{U}]$.

$[_a\ ?Ty(t)],\ a$   Axiom
$\leq [_a\ ?Ty(t)],\ [_{aL}\ ?Ty(e)],\ aL$   $\downarrow$
$\leq [_a\ ?Ty(t),\ [_*\ ?Fo(\epsilon, x, \text{book}\,x)]],\ [_{aL}\ Fo(\epsilon, x, \text{book}\,x), Ty(e)],\ a$   a book
$\leq [_a\ [_0\ ?Ty(e)],\ [_1\ ?Ty(e \to t)],\ [_*\ ]],\ [_{aL}\ ],\ a0$   $\downarrow$
$\leq [_a\ [_0\ Fo(\text{John}), Ty(e)],\ [_1\ ],\ [_*\ ]],\ [_{aL}\ ],\ a1$   John
$\leq [_a\ [_0\ ],\ [_1\ [_1\ Fo(\lambda x \lambda y\, \text{read}(x)(y)), Ty(e \to (e \to t))],\ [_0\ ?Ty(e)]],\ [_*\ ]],\ [_{aL}\ ],\ a10$   read
$\leq [_a\ [_0\ ],\ [_1\ [_1\ Fo(\lambda x \lambda y\, \text{read}(x)(y)), Ty(e \to (e \to t))],\ [_0\ Fo(\epsilon, x, \text{book}\,x)]],\ [_*\ ]],\ [_{aL}\ ],\ a10$   it
$\leq [_a\ [_0\ ],\ [_1\ [_1\ Fo(\lambda x \lambda y\, \text{read}(x)(y)), Ty(e \to (e \to t))],\ [_0\ Fo(\epsilon, x, \text{book}\,x)]]],\ [_{aL}\ ],\ a10$   $* := 10$
$\leq [_a\ [_0\ ],\ [_1\ Fo(\lambda y\, \text{read}(\epsilon, x, \text{book}\,x)(y)), Ty(e \to t),\ [_1\ ],\ [_0\ ]]],\ [_{aL}\ ],\ a1$   $\uparrow$
$\leq [_a\ \text{read}(\epsilon, x, \text{book}\,x)(\text{John}), Ty(t),\ [_0\ ],\ [_1\ [_1\ ][_0\ ]]],\ [_{aL}\ ],\ a \in LoFo$

Fig. 6. Pointed Partial Tree Models arising in the course of parsing "A book, John
read it." At every transition only the new aspects are highlighted.

Complex Actions We can put actions together by executing one after the
other (sequential composition ';') and doing that any finite number of times
(finite iteration '*'), or by indeterministically choosing between them (choice
'+').

6. If $\alpha$, $\alpha'$ are actions, then so are $\alpha ; \alpha'$, $\alpha + \alpha'$, $\alpha^*$.

Finally, we can put actions together in a conditional IF THEN ELSE statement. 

7. If $\Sigma$ is a set of formulas all variables of which occur in $\bar{x}$ and $\alpha$, $\alpha'$ are actions,
then $\langle \Sigma(\bar{x}), \alpha, \alpha' \rangle$ is a conditional action with the definition:

$\langle \Sigma(\bar{x}), \alpha, \alpha' \rangle = \{\langle Tn, T'n' \rangle \in \alpha[\bar{t}/\bar{x}] \mid \bar{t} \in (D \cup MV)^*,\ Tn \models_{\mathcal{M}} \Sigma[\bar{t}/\bar{x}]\} \;\cup\; \{\langle Tn, T'n' \rangle \in \alpha' \mid \neg\exists \bar{t} \in (D \cup MV)^*,\ Tn \models_{\mathcal{M}} \Sigma[\bar{t}/\bar{x}]\}$.

That is, if $\Sigma$ holds at $Tn$ for some substitution of $\bar{t}$ for variable $\bar{x}$, then action $\alpha$
is executed where in the body of this action the variable $\bar{x}$ is also replaced by $\bar{t}$.
If $\Sigma$ does not hold at $Tn$ for any substitution for $\bar{x}$, then action $\alpha'$ is executed.^
For instance we can define actions which transport features from one node (e.g.
a 'head') to another.

- $\langle \{Fo(\mathbf{x})\},\ \text{go}(\langle 0 \rangle); \text{put}(Fo(\mathbf{x})),\ AB \rangle$ maps the value of the Fo feature at the
current node, if there is such a value, to the Fo feature at the left daughter,
otherwise it aborts.

We may also define an action that enables pointer movement to the closest
clausal node (e.g. for tense suffixes):

- $\text{gofirst}(?Ty(t)) = \langle \{?Ty(t)\},\ 1,\ \text{go}(\langle \downarrow^{-1} \rangle) \rangle^* ;\ \langle \{?Ty(t)\},\ 1,\ AB \rangle$.

The action $\text{gofirst}(X)$ moves the pointer upwards to the first higher
node with annotation $X$, e.g. the requirement $?Ty(t)$, and then stops.
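
The basic and complex actions can be sketched in executable form as follows. This is a simplification under stated assumptions: actions are modelled as deterministic functions rather than relations, `None` plays the role of AB, facts and requirements share one set of strings per node, addresses are strings over "0" and "1" with the empty string as top node, and all function names are illustrative rather than taken from the definition.

```python
from typing import Callable, Dict, FrozenSet, Optional, Tuple

Tree = Dict[str, FrozenSet[str]]              # node address -> decorations
State = Tuple[Tree, str]                      # (partial tree, pointer)
Action = Callable[[State], Optional[State]]   # None stands for AB (abort)


def halt(state):
    return state


def abort(state):
    return None


def make(i: str) -> Action:
    """Create daughter i of the pointed node; the pointer does not move."""
    def act(state):
        tree, n = state
        new = dict(tree)
        new.setdefault(n + i, frozenset())
        return new, n
    return act


def go(i: str) -> Action:
    """Move to daughter i, or to the mother when i == 'up'."""
    def act(state):
        tree, n = state
        if i == "up":
            target = n[:-1] if n else None
        else:
            target = n + i
        if target is None or target not in tree:
            return None
        return tree, target
    return act


def put(formula: str) -> Action:
    """Add a decoration to the pointed node."""
    def act(state):
        tree, n = state
        new = dict(tree)
        new[n] = tree[n] | {formula}
        return new, n
    return act


def seq(*actions: Action) -> Action:
    """Sequential composition; aborts as soon as one step aborts."""
    def act(state):
        for a in actions:
            if state is None:
                return None
            state = a(state)
        return state
    return act


def cond(test: str, then: Action, otherwise: Action) -> Action:
    """IF test holds at the pointed node THEN then ELSE otherwise."""
    def act(state):
        tree, n = state
        return then(state) if test in tree[n] else otherwise(state)
    return act


def gofirst(formula: str) -> Action:
    """Move upwards until a node carrying formula is found; abort if none."""
    def act(state):
        while state is not None and formula not in state[0][state[1]]:
            state = go("up")(state)
        return state
    return act
```

In this sketch `gofirst` stops at the current node if it already carries the annotation, which corresponds to zero iterations of the starred conditional.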

^ The PDL tests do not suffice over this propositional logic because we want the ability
to formulate IF THEN ELSE statements and only in the context of excluded middle
can these be defined in terms of the PDL test.

We have two 'pure' classes of actions. Complex constructions (that is,
compositions of actions without the go action) represent complex tree growth. Complex
movements on the other hand (the family of $\text{go}(\#)$ instructions closed under
the given operations) represent pointer strategies. In general, for instance in
case of actions projected by words, the actions are a mixture of these classes.

Along a path in the space of alternative parse courses, the transitions leading 
from one partial tree to the next are of two fundamentally different kinds: they 
are either Computational Transitions or Lexical Transitions. 



3.2 Computational Transitions 

Computational transitions develop the information contained in the current tree:
they bring out information, but they do not add information. A typical example
of such a transition is the addition of the formula $\langle 0 \rangle Ty(e)$ to a node (description)
$Tn$ in the case there is a node $Tn'$ in the same tree such that $Tn \prec_0 Tn'$ and
the formula $Ty(e)$ is an element of $V(Tn')$: a conclusion is 'computed' from the
information contained in the tree. Figure 7 illustrates this rule.

$[_a\ [_1\ Fo(\lambda x\phi)],\ [_0\ Fo(\psi)]],\ a1$
$\leq [_a\ \langle 1 \rangle Fo(\lambda x\phi),\ [_1\ Fo(\lambda x\phi)],\ [_0\ Fo(\psi)]],\ a$

Fig. 7. Transfer of information up the tree.

This transition sets the stage for an application of
$\langle 0 \rangle Fo(\lambda x\phi) \wedge \langle 1 \rangle Fo(\psi) \to Fo(\phi[\psi/x])$.

Another computational rule is used to merge tree nodes. Figure 8 illustrates a
situation arising in Wh-questions like "what did John read?".

$[_a\ [_1\ ],\ [_0\ Tn(t), ?Ty(e)],\ [_*\ Fo(Wh), Ty(e), ?\exists \mathbf{x} Tn(\mathbf{x})]],\ a0$
$\leq [_a\ [_1\ ],\ [_0\ Tn(t), Fo(Wh), Ty(e)]],\ a0$

Fig. 8. Merging of nodes a0 and a*.

The Wh-element in a question has content of type e but no location in the
term, a 'free-floating' decoration; the gap lacks type e decoration but has a fixed
position in the term. Merging fulfills requirements for position and content of
both nodes at once.

A final example of a computational rule is the introduction of a linked structure
adjoined to an argument of type e.

$[_a\ \dots [_b\ Fo(\text{John}), Ty(e)],\ ],\ a \cdot b$
$\leq [_a\ \dots [_b\ Fo(\text{John}), Ty(e)],\ ],\ [_{a \cdot bL}\ ?Ty(t),\ [_*\ ?Fo(\text{John})]],\ a \cdot bL$

Fig. 9. Adjoining of a linked tree.

Here, having processed the word 'John', the top node of a new (partial) tree is
adjoined which requires at some as yet under-specified location the head $Fo(\text{John})$
as an argument. This rule sets the stage for a sentence like "John, who read a
book, ...". Here, the complementiser 'who', as any anaphoric object, copies the
head into the projection of the relative clause. The main clause and the relative clause
have to share some argument (the ‘head’), this is essential to the Link Construc- 
tion (see 0). 

3.3 Lexical Transitions 

Lexical Transitions, on the other hand, map one tree to a next one adding infor- 
mation in the process. These transitions are projected by the Natural Language 
words. They are defined as conditional actions of the IF THEN ELSE variety. A
lexical transition tests IF some finite set of formulas (the condition) holds
at the node where the pointer is located; this may include modal statements
about decorations located at nodes related to the pointed one, and it may also
involve requirements. If the condition holds there, THEN a (sequence of) action(s)
is undertaken resulting in a new tree description, ELSE (i.e., the condition does 
not hold) a second action is undertaken, usually an ‘abort’ action. 

The actions projected by a lexical item may carry out a whole range of differ- 
ent constructions. Some items project only an annotation (proper names), or 
only a requirement (case particles), or an annotation plus requirement (tense 
specifications), or, even, only a pointer movement (expletives). 

Definition 11 (Lexical Actions) A Lexical Action $w$ is an action of the form

$w = \langle \Sigma(x), \alpha, \alpha' \rangle$

where $\Sigma$ is a finite set of DU-formulas, $x$ is a sequence containing all
(free) variables occurring in $\Sigma$, and $\alpha, \alpha'$ are complex
actions as given by the earlier definition of complex actions.

For example, John and read project the following actions:

John   IF    $\{?Ty(e)\}$
       THEN  $put(Fo(John), Ty(e), [\downarrow]\bot)$
       ELSE  AB

read   IF    $\{?Ty(e \to t)\}$
       THEN  $make(\langle 1 \rangle); go(\langle 1 \rangle);$
             $put(Fo(read), Ty(e \to (e \to t)), [\downarrow]\bot);$
             $go(\langle 1 \rangle^{-1}); put(?\langle 0 \rangle Ty(e))$
       ELSE  AB
In case of the proper name John the actions are released by the presence of the
requirement for an object of type e. These actions then consist in annotating
the node with the content formula John and the type label e, and closing it off
as a terminal node. In the case of the transitive verb 'read', the presence at
the pointed node of a requirement for $Ty(e \to t)$ releases the actions:
creation of a function daughter node, annotation of that node with a type and a
formula, and the imposition on the mother node of a requirement for an argument
daughter of type e. Actions annotating the node with facts are the analogue of
lexical substitution processes in other frameworks (though it should be
remembered that it is not lexical items which are inserted in the tree in this
framework, but their logical correlate). Actions decorating a node with
requirements are analogous to sub-categorization statements. However, all nodes
are introduced with requirements. So, unlike standard sub-categorization
statements, which are imposed on sisters to terminal nodes as a requirement on
lexical insertion in a tree, the use of requirements in this framework is much
more widespread.
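
The two lexical actions above can be sketched in code. The following is a minimal illustration only, under assumptions that are not part of the framework itself: a tree is a Python dictionary from node addresses to nodes, a node is a dictionary holding a set of annotations and a set of requirements, the terminal-node marker $[\downarrow]\bot$ is reduced to a boolean flag, and the names `new_node`, `john`, `read` and `Abort` are hypothetical.

```python
class Abort(Exception):
    """Stands in for the ELSE branch (AB) of a lexical action."""


def new_node():
    # A node carries annotations (facts), requirements, and a terminal flag.
    return {"annotations": set(), "requirements": set(), "terminal": False}


def john(tree, pointer):
    """IF {?Ty(e)} THEN put(Fo(John), Ty(e), terminal) ELSE AB."""
    node = tree[pointer]
    if "?Ty(e)" not in node["requirements"]:
        raise Abort("'John' needs ?Ty(e) at the pointed node")
    node["annotations"] |= {"Fo(John)", "Ty(e)"}
    node["terminal"] = True
    return pointer


def read(tree, pointer):
    """IF {?Ty(e->t)} THEN build the function daughter ELSE AB."""
    node = tree[pointer]
    if "?Ty(e->t)" not in node["requirements"]:
        raise Abort("'read' needs ?Ty(e->t) at the pointed node")
    daughter = pointer + "1"            # make(<1>); go(<1>)
    tree[daughter] = new_node()
    tree[daughter]["annotations"] |= {"Fo(read)", "Ty(e->(e->t))"}
    tree[daughter]["terminal"] = True
    # go back up to the mother and require an argument daughter of type e
    node["requirements"].add("?<0>Ty(e)")
    return pointer


# Usage: a subject node a0 and a predicate node a1 below the root a.
tree = {"a0": new_node(), "a1": new_node()}
tree["a0"]["requirements"].add("?Ty(e)")
tree["a1"]["requirements"].add("?Ty(e->t)")
john(tree, "a0")
read(tree, "a1")
```

The sketch does not model how an annotation discharges the matching requirement; it only shows the conditional shape of the actions and the effect of a failed condition.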



4 Natural Languages 

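Having defined our representations of the logical forms, we now turn to the
language strings which create these forms. A natural language $\mathcal{L}$ is
determined by a pair consisting of a set of $\mathcal{L}$-words,
$W_\mathcal{L}$, and an $\mathcal{L}$-alternative-function $ALT_\mathcal{L}$,

$\mathcal{L} = \langle W_\mathcal{L}, ALT_\mathcal{L} \rangle$.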



- The set $W_\mathcal{L}$ of $\mathcal{L}$-words consists of a set of
IF $\Sigma(x)$ THEN $\alpha$ ELSE $\alpha'$ statements, as we have described
them. Traditionally the words of a language relate to the syntactic and semantic
construction of the clause merely by projecting the decorations of the terminal
nodes of some (binary) tree. In our model, on the other hand, words are modelled
as incremental actions on $PPT$. A word may add annotations and requirements at
various nodes, it may satisfy requirements, add tree structure, unify nodes and
draw conclusions from the annotations across various nodes.

- The function $ALT_\mathcal{L}$ assigns to a pointed partial tree a set of
alternative developments, alternative structural analyses which are typical for
the language $\mathcal{L}$. No successful development of $Tn$ is gained or lost
by considering $ALT_\mathcal{L}(Tn)$; that is the point of the definition of
alternative actions given below. The alternative-function represents that part
of a language $\mathcal{L}$ that can be most interestingly described as being
external to the lexicon.

Consider the sentence initial occurrence of John with three alternative procedural 
analyses (a structural analysis plus a pointer) induced by the Axiom: John as 
subject, John as fronted object, and John as topicalised object. To say that the 
set of these three analyses constitutes an alternative-set for the Axiom means 
first of all that if string s maps one of the alternatives of the axiom into a logical 
form then s is grammatical (the soundness part), but it also means this set 
is complete in the sense that every grammatical string can be accommodated 
by the alternative set: if string s is grammatical, then s can map at least one
alternative of the Axiom to a logical form. So we may replace the Axiom by its 
image under the alternative function without losing logical forms. 

The set of the three possible analyses of the sentence initial John is of course 
not complete in this sense, so it is not an alternative-set for the Axiom. But we 
assume that this set can be extended by some finite number of other analyses 
to a complete set of alternatives. Obviously, the finiteness assumption is crucial 
here for theoretical analysis of linguistic structure. At every point in the
parse of a sentence we can only prepare for a limited variation in structure.
These informal insights we will now define more carefully.

Definition 12 (Alternative Actions) An action $\alpha$ gives alternatives to
$Tn \in PPT$ if for all $T'n' \in LoFo$:

$Tn \le T'n' \Rightarrow \exists\, T''n'' : \langle Tn, T''n'' \rangle \in \alpha \ \&\ T''n'' \le T'n'$.

A function $ALT : PPT \to ACT$ is an alternative function if it assigns to
every element $Tn$ of $PPT$ an action $\alpha$ which gives alternatives to $Tn$
such that $\langle Tn, T'n' \rangle \in ALT(Tn)$ implies that $ALT(T'n') = 1$.

An action $\alpha$ gives alternatives to $Tn$ if every possible successful
development $T'n'$ is accessible from some element in
$\{T''n'' \in PPT \mid \langle Tn, T''n'' \rangle \in \alpha\}$.
Now we can consider a (finite) set of actions $\{\alpha_1, \ldots, \alpha_n\}$
to be complete w.r.t. $Tn$ if $\alpha_1 + \ldots + \alpha_n$ gives alternatives
to $Tn$. An alternative function assigns to every Pointed Partial Tree an action
giving alternatives, but alternatives are only 'unfolded' one level: an
alternative of an alternative is a fixpoint.

For purposes of presentation we will abuse this notation and use the formulation
$T'n' \in ALT(Tn)$ to mean that $\langle Tn, T'n' \rangle \in \alpha$, where
$ALT(Tn) = \alpha$. In earlier figures we have seen two images of the action
$ALT([_a\ ?Ty(t)])$ on the argument $[_a\ ?Ty(t)]$, namely,

$[_a\ ?Ty(t), [_0\ ?Ty(e)], [_1\ ?Ty(e \to t)]],\ a0$ and $[_a\ ?Ty(t), [_*\ ?Ty(e)]],\ a{*}$.

That is, the Axiom can be expanded to a subject verb-phrase structure with
the pointer at the subject, or it may be expanded to a structure expecting an
'unfixed' node annotated by a type e object. A third alternative is the
adjoining of a linked tree shown above. Of course, this does not exhaust the
alternatives that will be required for a full analysis of English.

If we now reconsider the lexical action projected by the word John, we see that
it wants a decoration $?Ty(e)$ for its actions to be executed. The Axiom itself
does not have such a decoration, but by the ALT function (at least) two pointed
partial trees are accessible with the pointer at a node with the right requirement.

Notice that the analyses are not purely structural, for the sentences John reads
a book and a book John reads eventually develop the same logical form starting
from the Axiom. The analyses give a structural analysis plus a pointer. The
location of this pointer determines the acceptability of a lexical item at that
point in the parse.
The notion of an alternative-set can also have more formal content, especially
in case of structural underspecification. By LFT principles we have the
following possible ALT function. $ALT([_a\, [_\downarrow\ \phi]])$ has the
elements

$[_a\, [_0\ \phi], [_1\ ]]$ and $[_a\, [_0\ ], [_1\ \phi]]$.

The tree relation of immediate dominance can be developed into either function-
or argument-daughter. This reflects the LFT principle
$\langle\downarrow\rangle\phi \leftrightarrow (\langle 0 \rangle\phi \vee \langle 1 \rangle\phi)$.
As another example of a formal alternative set consider
$ALT([_a\, [_*\ \phi]])$ which gives

$[_a\ \phi]$, and $[_a\, [_0\, [_*\ \phi]], [_1\ ]]$, and $[_a\, [_0\ ], [_1\, [_*\ \phi]]]$.

This reflects the LFT principle
$\langle * \rangle\phi \leftrightarrow (\phi \vee \langle\downarrow\rangle\langle * \rangle\phi)$.



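Finally, a pointed partial tree model with requirements $M = \langle WF, V, R \rangle$
together with a language $\mathcal{L} = \langle W_\mathcal{L}, ALT_\mathcal{L} \rangle$
determines a Parsing Model $M_\mathcal{L}$ in the following way.

Definition 13 (Parsing Models) A Parsing Model $M_\mathcal{L}$ is a pair

$M_\mathcal{L} = \langle M, \to_\mathcal{L} \rangle$

where $M$ is a Pointed Partial Tree Model with Requirements,
$\mathcal{L} = \langle W_\mathcal{L}, ALT_\mathcal{L} \rangle$ is a natural
language, and $\to_\mathcal{L}\ \subseteq PPT \times PPT$ is the smallest set of
$\mathcal{L}$-transitions $\{\to_s \mid s \in W_\mathcal{L}^*\}$ satisfying

• $Tn \to_w T'n'$ for $w \in W_\mathcal{L}$ iff there is a $T''n'' \in ALT(Tn)$ : $w(T''n'') = T'n'$,

• $Tn \to_{s \cdot s'} T'n'$ iff there is a $T''n''$ : $Tn \to_s T''n''$ and $T''n'' \to_{s'} T'n'$.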

With a Parsing Model $M_\mathcal{L}$ we can associate the function
$GRAM_\mathcal{L} : PPT \to \mathcal{P}(\mathcal{L})$, which maps a pointed
partial tree to the set of $Tn$-grammatical strings, that is, the strings that
satisfy all of $R(Tn)$:

$GRAM_\mathcal{L}(Tn) = \{s \in W_\mathcal{L}^* \mid \exists\, T'n' \in LoFo(Tn) : Tn \to_s T'n'\}$.

If we start $GRAM_\mathcal{L}$ from the Axiom, $GRAM_\mathcal{L}(Axiom)$, then
we get all $\mathcal{L}$-strings which produce logical forms starting from the
empty context, that is, the grammatical $\mathcal{L}$ sentences. We may,
however, start from a more structured context, that is, we consider strings from
$\mathcal{L}$ that need a context to be grammatical, for instance, the string
Bill a newspaper in the following example of PP-ellipsis: "John reads a book.
Bill a newspaper". Now, "John reads a book" may give the starting structure for
"Bill a newspaper"

$Ta0 = [_a\ ?Ty(t), [_0\ ?Ty(e)], [_1\, [_0\ Fo(\lambda x \lambda y\, read(x)(y))], [_1\ ?Ty(e)]]],\ a0$

This is the $\le$-lowest upper bound of the structures that result from parsing
the strings "John reads a book" and "Bill reads a newspaper". "Abstraction" over
'John' and 'a book' creates this structure, and
Bill a newspaper $\in GRAM_\mathcal{L}(Ta0)$.
A Parsing Model is specific for a language $\mathcal{L}$. The vocabulary
$W_\mathcal{L}$ of the language determines the 'instruction packages' that are
present to construct transitions from the Axiom to a logical form. It creates a
classification of the partial forms by the labels referred to in the conditions
of the lexical actions. The conditions of these actions may be sensitive both to
the structure-plus-annotations that has been constructed up to that point and to
the requirements that have not yet been fulfilled.

The alternative function $ALT_\mathcal{L}$ gives the possible structural
analyses that are typical of $\mathcal{L}$ but, as such an analysis includes a
pointer location, this function is also a determining factor in the word order
of the strings in $GRAM_\mathcal{L}(Axiom)$. An extensive study of the
variations and invariances across languages that Parsing Models can handle can
be found in 0.
Abstract. The change within the linguistic framework of transformational
grammar from GB-theory to minimalism brought up a particular type of formal
grammar, as well. We show that this type of minimalist grammar (MG) constitutes
a subclass of mildly context-sensitive grammars in the sense that for each MG
there is a weakly equivalent linear context-free rewriting system (LCFRS).
Moreover, an infinite hierarchy of MGs is established in relation to a hierarchy
of LCFRSs.



1 Introduction 

The change within the linguistic framework of transformational grammar from
GB-theory to minimalism brought up a new formal grammar type, the type of
a minimalist grammar (MG) introduced by Stabler (see e.g. [?]), which is an
attempt at a rigorous algebraic formalization of the new linguistic
perspectives. One of the questions that arise from such a definition concerns
the weak generative power of the corresponding grammar class. Stabler [?] has
shown that MGs give rise to languages not derivable by any tree adjoining
grammar (TAG). But he leaves open the "... problem to specify how the
MG-definable string sets compare to previously studied supersets of the TAG
language class." We address this issue here by showing that each MG as defined
in [?] can be converted into a linear context-free rewriting system (LCFRS)
which derives the same (string) language. In this sense MGs fall into the class
of mildly context-sensitive grammars (MCSGs) rather informally introduced in [?]
and described in e.g. [?].

The paper is structured as follows. We start by briefly repeating the definition
of an LCFRS and the language it derives (Sect. 2). Turning to MGs, we then
introduce the concept of a relevant expression in order to reduce the closure of
an MG to such expressions (Sect. 3). Depending on this relevant closure, for
a given MG we construct an LCFRS in detail and prove both grammars to be
weakly equivalent (Sect. 4). Finally, an infinite hierarchy of MGs is introduced
in relation to a hierarchy of LCFRSs. The former is unboundedly increasing,
which is shown by presenting for each finite number an MG that derives a
language with counting dependencies in size of this number (Sect. [?]).

* This work has been carried out within the Innovationskolleg ‘Formal Models of Co- 
gnitive Complexity’ (INK 12) funded by the DFG. I especially wish to thank Marcus 
Kracht for inspiring discussions, and Peter Staudacher as well as an anonymous 
referee for a lot of valuable comments on a previous version of this paper. 
2 Linear Context-Free Rewriting Systems

In order to keep the paper self-contained, in this section we quickly go through
a number of definitions, which will be of interest in Sect. 4 again.

Definition 2.1 ([?]). A generalized context-free grammar (GCFG) is a five-
tuple $G = (N, O, F, R, S)$ for which the conditions (G1)-(G5) hold.

(G1) $N$ is a finite non-empty set of nonterminal symbols.

(G2) $O \subseteq \bigcup_{n \in \mathbb{N}} (\Sigma^*)^{n+1}$ for some finite
non-empty set $\Sigma$ of terminal symbols with $\Sigma \cap N = \emptyset$,
hence $O$ is a set of finite tuples of finite strings in $\Sigma$.

(G3) $F$ is a finite subset of $\bigcup_{n \in \mathbb{N}} F_n$, where $F_n$ is
the set of partial functions from $O^n$ to $O$, i.e. $F_0$ is the set of
constants in $O$.

(G4) $R \subseteq \bigcup_{n \in \mathbb{N}} (F_n \times N^{n+1})$ is a finite
set of (rewriting) rules.

(G5) $S \in N$ is the distinguished start symbol.

Let $G = (N, O, F, R, S)$ be a GCFG. A rule
$r = (f, A_0, A_1, \ldots, A_n) \in F_n \times N^{n+1}$ is generally written
$A_0 \to f(A_1, \ldots, A_n)$, and just $A_0 \to f$ in case $n = 0$. If the
latter, i.e. if $f \in O$, then $r$ is terminating, otherwise $r$ is
nonterminating. For $A \in N$ and $k \in \mathbb{N}$ the set
$L_G^k(A) \subseteq O$ is given recursively in the following sense:

(L1) $\theta \in L_G^0(A)$ for each terminating rule $A \to \theta \in R$.

(L2) $\theta \in L_G^{k+1}(A)$, if $\theta \in L_G^k(A)$ or if there is
$A \to f(A_1, \ldots, A_n) \in R$ and there are $\theta_i \in L_G^k(A_i)$ for
$1 \le i \le n$ such that $\theta = f(\theta_1, \ldots, \theta_n)$ is defined.

We say $A$ derives $\theta$ (in $G$) if $\theta \in L_G^k(A)$ for some
$k \in \mathbb{N}$. In this case $\theta$ is called an $A$-phrase (in $G$). The
language derivable from $A$ (by $G$) is the set $L_G(A)$ of all $A$-phrases (in
$G$), i.e. $L_G(A) = \bigcup_{k \in \mathbb{N}} L_G^k(A)$. The set
$L(G) = L_G(S)$ is the generalized context-free language (GCFL) (derivable by
$G$).
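
The recursion (L1)/(L2) can be followed step by step on a small instance. The sketch below is only an illustration under stated assumptions: the rule encoding as triples `(lhs, f, args)`, the function `levels`, and the toy grammar with nonterminals `A` and `S` (deriving strings of the form a^n b^n c^n d^n) are all hypothetical and do not come from the paper.

```python
from itertools import product


def levels(rules, k):
    """Return a dict mapping each nonterminal A to the finite set L_G^k(A).

    A rule is a triple (lhs, f, args). For a terminating rule, args is empty
    and f is a constant tuple of strings; otherwise f is a Python function
    that returns a tuple of strings, or None where it is undefined.
    """
    current = {}
    for lhs, f, args in rules:
        current.setdefault(lhs, set())
        for b in args:
            current.setdefault(b, set())
        if not args:
            current[lhs].add(f)                      # (L1)
    for _ in range(k):
        nxt = {a: set(s) for a, s in current.items()}
        for lhs, f, args in rules:
            if not args:
                continue
            for thetas in product(*(current[b] for b in args)):
                value = f(*thetas)                   # (L2)
                if value is not None:
                    nxt[lhs].add(value)
        current = nxt
    return current


def wrap(x):
    # g(x1, x2) = (a x1 b, c x2 d)
    return ("a" + x[0] + "b", "c" + x[1] + "d")


def concat(x):
    # f(x1, x2) = (x1 x2)
    return (x[0] + x[1],)


RULES = [
    ("A", ("", ""), ()),     # terminating rule A -> (eps, eps)
    ("A", wrap, ("A",)),     # A -> g(A)
    ("S", concat, ("A",)),   # S -> f(A)
]

print(sorted(levels(RULES, 3)["S"]))
```

Each additional level adds exactly one new $S$-phrase here, which mirrors the fact that $L_G(S)$ is the union of the increasing chain of sets $L_G^k(S)$.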

Definition 2.2 ([?]). For every $m \in \mathbb{N}$ with $m \neq 0$ an
m-multiple context-free grammar (m-MCFG) is a GCFG $G = (N, O, F, R, S)$ which
satisfies (M1)-(M4).

(M1) $O = \bigcup_{i=1}^{m} (\Sigma^*)^i$.

(M2) For $f \in F$ let $n(f) \in \mathbb{N}$ be the number of arguments of $f$,
i.e. $f \in F_{n(f)}$. For each $f \in F$ there are $r(f) \in \mathbb{N}$ and
$d_i(f) \in \mathbb{N}$ for $1 \le i \le n(f)$ such that $f$ is a (total)
function from $(\Sigma^*)^{d_1(f)} \times \ldots \times (\Sigma^*)^{d_{n(f)}(f)}$
to $(\Sigma^*)^{r(f)}$ for which (f1) and, in addition, the anti-copying
condition (f2) hold.

(f1) Let $X = \{x_{ij} \mid 1 \le i \le n(f), 1 \le j \le d_i(f)\}$ be a set of
pairwise distinct variables, and let $x_i = (x_{i1}, \ldots, x_{id_i(f)})$ for
$1 \le i \le n(f)$. For $1 \le h \le r(f)$ let $f^h$ be the h-th component of
$f$, i.e. $f(\theta) = (f^1(\theta), \ldots, f^{r(f)}(\theta))$ for all
$\theta = (\theta_1, \ldots, \theta_{n(f)}) \in (\Sigma^*)^{d_1(f)} \times \ldots \times (\Sigma^*)^{d_{n(f)}(f)}$.
Then for each component $f^h$ there is an $l_h(f) \in \mathbb{N}$ such that
$f^h$ can be represented by

($c_h$) $f^h(x_1, \ldots, x_{n(f)}) = \zeta_{h0} z_{h1} \zeta_{h1} \ldots z_{hl_h(f)} \zeta_{hl_h(f)}$

with $\zeta_{hl} \in \Sigma^*$ for $0 \le l \le l_h(f)$ and $z_{hl} \in X$ for
$1 \le l \le l_h(f)$.

(f2) For each $1 \le i \le n(f)$ and $1 \le j \le d_i(f)$ there is at most one
$1 \le h \le r(f)$ and at most one $1 \le l \le l_h(f)$ such that
$x_{ij} = z_{hl}$, i.e. $z_{hl}$ is the only occurrence of $x_{ij} \in X$ in all
righthand sides of ($c_1$)-($c_{r(f)}$).

(M3) There is a function $d$ from $N$ to $\mathbb{N}$ such that, if
$A_0 \to f(A_1, \ldots, A_{n(f)}) \in R$ then $r(f) = d(A_0)$ and
$d_i(f) = d(A_i)$ for $1 \le i \le n(f)$.

(M4) $d(S) = 1$ for the start symbol $S$.

The language $L(G)$ is an m-multiple context-free language (m-MCFL).

In case that $m = 1$ and that each $f \in F \setminus F_0$ is the concatenation
function from $(\Sigma^*)^{n+1}$ to $\Sigma^*$ for some $n \in \mathbb{N}$, $G$
is a context-free grammar (CFG) and $L(G)$ a context-free language (CFL) in the
usual sense.

^ $\mathbb{N}$ denotes the set of all non-negative integers. For any non-empty
set $M$ and $n \in \mathbb{N}$, $M^{n+1}$ is the set of all $n+1$-tuples in $M$,
i.e. the set of all finite strings in $M$ with length $n+1$. $M^*$ is the set of
all finite strings in $M$ including the empty string $\epsilon$.

^ For any two sets $M_1$ and $M_2$, $M_1 \times M_2$ is the set of all pairs
with 1st component in $M_1$ and 2nd component in $M_2$.



Definition 2.3 ([?]). For $m \in \mathbb{N}$ with $m \neq 0$ an m-MCFG
$G = (N, O, F, R, S)$ according to Definition 2.2 is an m-linear context-free
rewriting system (m-LCFRS) if for all $f \in F$ the non-erasure condition (f3)
holds in addition to (f1) and (f2).

(f3) For each $1 \le i \le n(f)$ and $1 \le j \le d_i(f)$ there are
$1 \le h \le r(f)$ and $1 \le l \le l_h(f)$ such that $x_{ij} = z_{hl}$, i.e.
each $x_{ij} \in X$ has to appear in one of the righthand sides of
($c_1$)-($c_{r(f)}$).

The language $L(G)$ is an m-linear context-free rewriting language (m-LCFRL).
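
The conditions (f2) and (f3) are purely syntactic checks on the representations ($c_h$), so they are easy to test mechanically. The following sketch assumes a hypothetical encoding that is not part of the paper: a function is given as a list of components, each component is a list whose items are either terminal strings or variables, and a variable $x_{ij}$ is written as the pair `(i, j)`.

```python
from collections import Counter


def variable_counts(components):
    """Count the occurrences of every variable (i, j) over all components."""
    return Counter(
        item
        for component in components
        for item in component
        if isinstance(item, tuple)
    )


def anti_copying(components):
    """(f2): every variable occurs at most once in all right-hand sides."""
    return all(n <= 1 for n in variable_counts(components).values())


def non_erasing(components, dims):
    """(f3): every variable x_ij with 1 <= j <= dims[i-1] occurs at least once."""
    counts = variable_counts(components)
    return all(
        counts[(i, j)] >= 1
        for i, d in enumerate(dims, start=1)
        for j in range(1, d + 1)
    )


# g(x11, x12) = (a x11 b, c x12 d): satisfies both (f2) and (f3).
g = [["a", (1, 1), "b"], ["c", (1, 2), "d"]]
# copy(x11) = (x11 x11): violates (f2).
copy = [[(1, 1), (1, 1)]]
# first(x11, x12) = (x11): satisfies (f2) but violates (f3).
first_only = [[(1, 1)]]

print(anti_copying(g), non_erasing(g, [2]))
```

A function such as `first_only` is admissible in an MCFG but not in an LCFRS, which is exactly the difference between Definitions 2.2 and 2.3.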

A grammar is also called an MCFG (LCFRS) if it is an m-MCFG (m-LCFRS) for some
$m \in \mathbb{N} \setminus \{0\}$. A language is an MCFL (LCFRL) if it is
derivable by some MCFG (LCFRS). The class of MCFGs is essentially the same
as the class of LCFRSs. The latter was first described in [?] and has been
studied in some detail in [?]. The "non-erasing property" (f3), motivated by
linguistic considerations, is omitted in the general MCFG-definition. [?] shows
that for each $m \in \mathbb{N} \setminus \{0\}$ the class of m-MCFLs and that
of m-LCFRLs are equal. In Sect. 4 we in fact construct an LCFRS that is weakly
equivalent to a given minimalist grammar.

3 Minimalist Grammars 

We first give the definition of a minimalist grammar along the lines of [?].
Then, we introduce a "concept of relevance" being of central importance later on.

Definition 3.1. A five-tuple
$\tau = (N_\tau, \triangleleft_\tau^*, \prec_\tau, <_\tau, Label_\tau)$
fulfilling (E1)-(E3) is called an expression (over a feature-set $F$).



^ Recall that we use $\epsilon$ to denote the empty string, whereas [?] uses $\lambda$.
(E1) $(N_\tau, \triangleleft_\tau^*, \prec_\tau)$ is a finite, binary ordered
tree. $N_\tau$ denotes the non-empty set of nodes. $\triangleleft_\tau^*$ and
$\prec_\tau$ denote the usual relations of dominance and precedence defined on a
subset of $N_\tau \times N_\tau$, respectively. I.e. $\triangleleft_\tau^*$ is
the reflexive and transitive closure of $\triangleleft_\tau$, the relation of
immediate dominance.

(E2) $<_\tau\ \subseteq N_\tau \times N_\tau$ denotes the asymmetric relation of
(immediate) projection which holds for any two siblings in
$(N_\tau, \triangleleft_\tau^*, \prec_\tau)$, i.e. each node different from the
root either (immediately) projects over its sibling or vice versa.

(E3) The function $Label_\tau$ assigns a string from $F^*$ to every leaf of
$(N_\tau, \triangleleft_\tau^*, \prec_\tau)$, i.e. a leaf-label is a finite
sequence of features from $F$.

The set of all expressions over F is denoted by Exp{F). 

Let $F$ be a set of features. Consider
$\tau = (N_\tau, \triangleleft_\tau^*, \prec_\tau, <_\tau, Label_\tau) \in Exp(F)$.

A node $x \in N_\tau$ is a maximal projection, if it is the root of $\tau$ or if
$x$'s sister projects over $x$. Each $x \in N_\tau$ has a head
$h(x) \in N_\tau$, a leaf such that $x \triangleleft_\tau^* h(x)$, and such that
each $y \in N_\tau$ on the path from $x$ to $h(x)$ with $y \neq x$ projects over
its sister. The head of $\tau$ is the head of $\tau$'s root $r_\tau$.

$\tau$ has feature $f \in F$ if $\tau$'s head-label starts with $f$. $\tau$ is
simple (a head) if it consists of exactly one node, otherwise $\tau$ is complex
(a non-head).

Suppose $\upsilon$ and $\phi \in Exp(F)$ to be subtrees of $\tau$ with roots
$r_\upsilon$ and $r_\phi$, respectively, such that
$r_\tau \triangleleft_\tau r_\upsilon, r_\phi$. Then we take
$[_< \upsilon, \phi]$ ($[_> \phi, \upsilon]$) to denote $\tau$ in case that
$r_\upsilon <_\tau r_\phi$, and $r_\upsilon \prec_\tau r_\phi$
($r_\phi \prec_\tau r_\upsilon$).

Definition 3.2 ([?]). A 4-tuple $G = (V, Cat, Lex, \mathcal{F})$ that obeys
(N1)-(N4) is called a minimalist grammar (MG).

(N1) $V = P \cup I$ is a finite set of non-syntactic features, where $P$ is a
set of phonetic features and $I$ is a set of semantic features.

(N2) $Cat$ is a finite set of syntactic features partitioned into the sets
base, select, licensees and licensors such that for each (basic) category
x $\in$ base the existence of =x, =X and X= $\in$ select is possible, and for
each -x $\in$ licensees the existence of +x and +X $\in$ licensors. Moreover,
the set base contains at least the category c.

(N3) $Lex$ is a finite set of expressions over $V \cup Cat$ such that for each
tree $\tau = (N_\tau, \triangleleft_\tau^*, \prec_\tau, <_\tau, Label_\tau) \in Lex$
the function $Label_\tau$ assigns a string from
select$^*$(licensors $\cup \{\epsilon\}$)select$^*$(base $\cup \{\epsilon\}$)licensees$^* P^* I^*$
to each leaf in $(N_\tau, \triangleleft_\tau^*, \prec_\tau)$.

(N4) The set $\mathcal{F}$ consists of the structure building functions merge
and move as defined in (me) and (mo), respectively.

(me) The function merge is a partial mapping from
$Exp(V \cup Cat) \times Exp(V \cup Cat)$ to $Exp(V \cup Cat)$. A pair of
expressions $(\upsilon, \phi)$ belongs to Dom(merge) if $\upsilon$ has feature
=x, =X or X= and $\phi$ has category x for some x $\in$ base. Then,

^ Up to an isomorphism $N_\tau$ is a unique prefix closed and left closed subset
of $\mathbb{N}^*$, i.e. $\chi \in N_\tau$ if $\chi\chi' \in N_\tau$, and
$\chi i \in N_\tau$ if $\chi j \in N_\tau$ for $\chi, \chi' \in \mathbb{N}^*$
and $i, j \in \mathbb{N}$ with $i < j$, such that for $\chi, \psi \in N_\tau$
hold: $\chi \triangleleft_\tau \psi$ iff $\psi = \chi i$ for some
$i \in \mathbb{N}$, and $\chi \prec_\tau \psi$ iff $\chi = \omega i \chi'$ and
$\psi = \omega j \psi'$ for some $\omega, \chi', \psi' \in \mathbb{N}^*$ and
$i, j \in \mathbb{N}$ with $i < j$.

^ For each (partial) mapping $f$ from a set $M_1$ into a set $M_2$ we take
Dom($f$) to denote the domain of $f$, the subset of $M_1$ for which $f$ is
defined.
(me.1) $merge(\upsilon, \phi) = [_< \upsilon', \phi']$ if $\upsilon$ is simple
and has feature =x, where $\upsilon'$ and $\phi'$ are expressions resulting from
$\upsilon$ and $\phi$, respectively, by deleting the feature the respective
head-label starts with.

(me.2) $merge(\upsilon, \phi) = [_< \upsilon', \phi']$ if $\upsilon$ is simple
and has feature =X, where $\upsilon'$ and $\phi'$ are expressions resulting from
$\upsilon$ and $\phi$, respectively, by deleting the feature the respective
head-label starts with. In addition the phonetic features $\pi_\phi$ of the head
of $\phi$ are canceled in $\phi'$, and the phonetic features $\pi_\upsilon$ of
the head of $\upsilon$ are replaced by $\pi_\upsilon\pi_\phi$ in $\upsilon'$.

(me.3) $merge(\upsilon, \phi) = [_< \upsilon', \phi']$ if $\upsilon$ is simple
and has feature X=, where $\upsilon'$ and $\phi'$ are expressions resulting from
$\upsilon$ and $\phi$, respectively, by deleting the feature the respective
head-label starts with. In addition the phonetic features $\pi_\phi$ of the head
of $\phi$ are canceled in $\phi'$, and the phonetic features $\pi_\upsilon$ of
the head of $\upsilon$ are replaced by $\pi_\phi\pi_\upsilon$ in $\upsilon'$.

(me.4) $merge(\upsilon, \phi) = [_> \phi', \upsilon']$ if $\upsilon$ is complex
and has feature =x, where $\upsilon'$ and $\phi'$ are expressions as in case
(me.1).

(mo) The function move is a partially defined mapping from $Exp(V \cup Cat)$ to
$Exp(V \cup Cat)$. An expression $\upsilon$ belongs to Dom(move) in case that
$\upsilon$ has feature +x or +X $\in$ licensors, and $\upsilon$ has exactly one
maximal subtree $\phi$ that has feature -x $\in$ licensees. Then,

(mo.1) $move(\upsilon) = [_> \phi', \upsilon']$ if $\upsilon$ has feature +X.
Here $\upsilon'$ results from $\upsilon$ by deleting the feature +X from
$\upsilon$'s head-label, while the subtree $\phi$ is replaced by a single node
labeled $\epsilon$. $\phi'$ is the expression resulting from $\phi$ just by
deleting the licensee feature -x that $\phi$'s head-label starts with.

(mo.2) $move(\upsilon) = [_> \phi', \upsilon']$ if $\upsilon$ has feature +x.
Here $\upsilon'$ results from $\upsilon$ by deleting the feature +x from
$\upsilon$'s head-label, while within the subtree $\phi$ all non-phonetic
features are deleted. $\phi'$ is the expression resulting from $\phi$ by
deleting the licensee feature -x that $\phi$'s head-label starts with, and all
phonetic features that appear in $\phi$.

A feature of the form =X, X= or +X is called strong, one of the form =x or +x is
called weak. A strong selection feature =X or X= triggers (overt) head movement,
i.e. incorporation of the phonetic head-features of a possibly complex expression
into the selecting head (cf. (me.2), (me.3)). A strong licensor +X triggers overt
(phrasal) movement, also called pied-piping (cf. (mo.1)). A weak licensor +x
triggers covert (phrasal) movement (cf. (mo.2)).

Example 3.3. Assume $G_2$ to be the MG for which $I = \emptyset$ and
$P = \{/a_1/, /a_2/\}$, while base = {c} $\cup$ {b1, b2, c1, c2, d1, d2},
select = {=b1, =b2, =c1, =c2, =d1, =d2}, licensees = {-l1, -l2} and
licensors = {+L1, +L2}, and while $Lex$ consists of

$\alpha_0$ = c
$\beta_1$ = b1 -l1 $/a_2/$
$\beta_2$ = =b1 b2 -l2 $/a_1/$
$\gamma_1$ = =b2 +L1 c1 -l1 $/a_2/$
$\gamma_1'$ = =c2 +L1 c1 -l1 $/a_2/$
$\gamma_2$ = =c1 +L2 c2 -l2 $/a_1/$
$\delta_1$ = =b2 +L1 d1
$\delta_1'$ = =c2 +L1 d1
$\delta_2$ = =d1 +L2 d2
$\zeta_0$ = =d2 c

Then e.g.
$move(merge(\gamma_2, move(merge(\gamma_1, merge(\beta_2, \beta_1))))) \in Exp(V \cup Cat)$.

Let $G = (V, Cat, Lex, \mathcal{F})$ be an MG. Then
$CL(G) = \bigcup_{k \in \mathbb{N}} CL^k(G)$ is the closure of $Lex$ (under the
functions in $\mathcal{F}$). For $k \in \mathbb{N}$ the sets
$CL^k(G) \subseteq Exp(V \cup Cat)$ are inductively defined by

(C1) $CL^0(G) = Lex$
(C2) $CL^{k+1}(G) = CL^k(G)$
     $\cup\ \{merge(\upsilon, \phi) \mid (\upsilon, \phi) \in Dom(merge) \cap CL^k(G) \times CL^k(G)\}$
     $\cup\ \{move(\upsilon) \mid \upsilon \in Dom(move) \cap CL^k(G)\}$

Each $\tau \in CL(G)$ is called an expression in $G$. Such a $\tau$ is complete
(in $G$) if its head-label is in $\{c\}P^*I^*$ and each other of its leaf-labels
is in $P^*I^*$. Hence, a complete expression has category c, and this instance
of c is the only instance of a syntactic feature within all leaf-labels.

The (phonetic) yield $Y(\tau)$ of an expression $\tau \in Exp(V \cup Cat)$ is
the string created by concatenating $\tau$'s leaf-labels "from left to right"
and stripping off all non-phonetic features.
$L(G) = \{Y(\tau) \mid \tau \in CL(G)$ with $\tau$ is complete$\}$ is the
(string) language (derivable by $G$) and is called a minimalist language (ML).

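Example 3.4. Consider the MG $G_2$ from Example 3.3. Let
$\tau^{(1)} = merge(\beta_2, \beta_1)$,
$\tau^{(2)} = merge(\gamma_1, \tau^{(1)})$ and
$\upsilon^{(2)} = merge(\delta_1, \tau^{(1)})$. For $k \in \mathbb{N}$ with
$k \neq 0$ define

$\tau^{(2k+1)} = move(\tau^{(2k)})$, $\tau^{(4k)} = merge(\gamma_2, \tau^{(4k-1)})$, $\tau^{(4k+2)} = merge(\gamma_1', \tau^{(4k+1)})$,

$\upsilon^{(2k+1)} = move(\upsilon^{(2k)})$, $\upsilon^{(4k)} = merge(\delta_2, \upsilon^{(4k-1)})$, $\upsilon^{(4k+2)} = merge(\delta_1', \tau^{(4k+1)})$

and $\phi^{(4k+2)} = merge(\zeta_0, \upsilon^{(4k+1)})$. Then we have

$CL^1(G_2) \setminus CL^0(G_2) = \{\tau^{(1)}\}$ and $CL^2(G_2) \setminus CL^1(G_2) = \{\tau^{(2)}, \upsilon^{(2)}\}$,

while for $k \in \mathbb{N} \setminus \{0\}$ and $-1 \le i \le 1$ we have

$CL^{4k+i}(G_2) \setminus CL^{4k+i-1}(G_2) = \{\tau^{(4k+i)}, \upsilon^{(4k+i)}\}$,
$CL^{4k+2}(G_2) \setminus CL^{4k+1}(G_2) = \{\tau^{(4k+2)}, \upsilon^{(4k+2)}, \phi^{(4k+2)}\}$.

The set of complete expressions in $G_2$ is
$\{\alpha_0\} \cup \{\phi^{(4k+2)} \mid k \in \mathbb{N}, k \neq 0\}$, and
the language derivable by $G_2$ is $\{/a_1/^n /a_2/^n \mid n \in \mathbb{N}\}$.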
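
The derivations of $G_2$ can be replayed mechanically. The sketch below is a simplified illustration, not a full implementation of Definition 3.2: it covers only the cases that $G_2$ uses, namely weak selection ((me.1), (me.4)) and strong licensors ((mo.1)), and it ignores semantic features. The classes `Leaf` and `Node`, the helper names, and the function `derive` are hypothetical; `left_projects` encodes the difference between $[_< \cdot, \cdot]$ and $[_> \cdot, \cdot]$.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Leaf:
    syn: Tuple[str, ...] = ()    # syntactic features, leftmost first
    phon: Tuple[str, ...] = ()   # phonetic features


@dataclass(frozen=True)
class Node:
    left: "Tree"
    right: "Tree"
    left_projects: bool          # True encodes [< l, r], False encodes [> l, r]


Tree = Union[Leaf, Node]


def head(t):
    while isinstance(t, Node):
        t = t.left if t.left_projects else t.right
    return t


def first(t):
    syn = head(t).syn
    return syn[0] if syn else None


def drop(t):
    """Delete the feature that the head-label of t starts with."""
    if isinstance(t, Leaf):
        return Leaf(t.syn[1:], t.phon)
    if t.left_projects:
        return Node(drop(t.left), t.right, True)
    return Node(t.left, drop(t.right), False)


def merge(v, phi):
    f = first(v)
    if f is None or not f.startswith("=") or first(phi) != f[1:]:
        raise ValueError("pair is not in Dom(merge) (weak selection only)")
    if isinstance(v, Leaf):
        return Node(drop(v), drop(phi), True)      # (me.1)
    return Node(drop(phi), drop(v), False)         # (me.4)


def maximal_with(t, feat, path=()):
    """Paths to all maximal proper subtrees of t that have feature feat."""
    if isinstance(t, Leaf):
        return []
    found = []
    for step, child, is_max in (("L", t.left, not t.left_projects),
                                ("R", t.right, t.left_projects)):
        if is_max and first(child) == feat:
            found.append(path + (step,))
        found += maximal_with(child, feat, path + (step,))
    return found


def subtree(t, path):
    for step in path:
        t = t.left if step == "L" else t.right
    return t


def replace(t, path, new):
    if not path:
        return new
    if path[0] == "L":
        return Node(replace(t.left, path[1:], new), t.right, t.left_projects)
    return Node(t.left, replace(t.right, path[1:], new), t.left_projects)


def move(v):
    f = first(v)
    if f is None or not (f.startswith("+") and f[1:2].isupper()):
        raise ValueError("expression has no strong licensor")
    paths = maximal_with(v, "-" + f[1:].lower())
    if len(paths) != 1:
        raise ValueError("expression is not in Dom(move)")
    phi = subtree(v, paths[0])
    return Node(drop(phi), drop(replace(v, paths[0], Leaf())), False)  # (mo.1)


def leaves(t):
    return [t] if isinstance(t, Leaf) else leaves(t.left) + leaves(t.right)


def phon_yield(t):
    return [p for leaf in leaves(t) for p in leaf.phon]


def is_complete(t):
    h = head(t)
    return h.syn == ("c",) and all(l.syn == () for l in leaves(t) if l is not h)


A1, A2 = "/a1/", "/a2/"
beta1, beta2 = Leaf(("b1", "-l1"), (A2,)), Leaf(("=b1", "b2", "-l2"), (A1,))
gamma1 = Leaf(("=b2", "+L1", "c1", "-l1"), (A2,))
gamma1p = Leaf(("=c2", "+L1", "c1", "-l1"), (A2,))
gamma2 = Leaf(("=c1", "+L2", "c2", "-l2"), (A1,))
delta1, delta1p = Leaf(("=b2", "+L1", "d1")), Leaf(("=c2", "+L1", "d1"))
delta2, zeta0 = Leaf(("=d1", "+L2", "d2")), Leaf(("=d2", "c"))


def derive(n):
    """Build the complete expression with yield /a1/^n /a2/^n, for n >= 1."""
    t = merge(beta2, beta1)
    for k in range(1, n):
        t = move(merge(gamma1 if k == 1 else gamma1p, t))
        t = move(merge(gamma2, t))
    t = move(merge(delta1 if n == 1 else delta1p, t))
    return merge(zeta0, move(merge(delta2, t)))
```

Calling `phon_yield(derive(2))` walks through the expressions $\tau^{(1)}, \ldots, \tau^{(5)}, \upsilon^{(6)}, \ldots, \upsilon^{(9)}, \phi^{(10)}$ of Example 3.4 and returns the phonetic yield of $\phi^{(10)}$.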

Definition 3.5. For each MG $G = (V, Cat, Lex, \mathcal{F})$, an expression
$\tau \in CL(G)$ is called relevant (in $G$) if it has property (R).

(R) For any -x $\in$ licensees there is at most one maximal proper subtree
$\tau_{-x}$ of $\tau$ that has feature -x.

We take $Rel(G)$ to denote the set of all relevant expressions $\tau \in CL(G)$.

^ In fact, this kind of structure is characteristic of each $\tau \in CL(G)$
involved in creating a complete expression in $G$ as will become clear
immediately.
Let $G = (V, Cat, Lex, \mathcal{F})$ be an MG and consider
$RCL(G) = \bigcup_{k \in \mathbb{N}} RCL^k(G)$, a particularly restricted
closure of $G$, the relevant closure (of $G$). For $k \in \mathbb{N}$ the sets
$RCL^k(G)$ are inductively defined w.r.t. $Rel(G)$ by

(R1) $RCL^0(G) = \{\tau \in Rel(G) \mid \tau \in Lex\}$

(R2) $RCL^{k+1}(G) = RCL^k(G)$
     $\cup\ \{merge(\upsilon, \phi) \in Rel(G) \mid (\upsilon, \phi) \in Dom(merge) \cap RCL^k(G) \times RCL^k(G)\}$
     $\cup\ \{move(\upsilon) \in Rel(G) \mid \upsilon \in Dom(move) \cap RCL^k(G)\}$

Lemma 3.6. If $\tau \in CL^k(G) \cap Rel(G)$ for some $k \in \mathbb{N}$, then
$\tau \in RCL^k(G)$.

A proof of Lemma 3.6 can straightforwardly be obtained by an induction on
$k \in \mathbb{N}$. On the other hand, it is an immediate consequence of the
respective definitions that $RCL^k(G) \subseteq CL^k(G) \cap Rel(G)$ for each
$k \in \mathbb{N}$. Thus,

Proposition 3.7. $Rel(G) = RCL(G)$.

Consequently, since each complete $\tau \in CL(G)$ has property (R), we can fix

Corollary 3.8. $L(G) = \{Y(\tau) \mid \tau \in RCL(G)$ with $\tau$ is complete$\}$.

This points out why it is reasonable to call $RCL(G)$ the relevant closure (of
$G$).

Remark 3.9. For $G_2$ as in Example 3.4, $RCL^k(G_2) = CL^k(G_2)$ for each
$k \in \mathbb{N}$.

4 Weak Generative Power 

Let $G_{MG} = (V, Cat, Lex, \mathcal{F})$ be an MG with
$\{-l_i \mid 1 \le i \le m\}$ an enumeration of licensees for some
$m \in \mathbb{N}$. We will construct an $m{+}2$-MCFG $G = (N, O, F, R, S)$ that
derives the same language as $G_{MG}$ (Corollary 4.5).

Thus, in $G$ the start symbol $S$ will derive exactly those strings of phonetic
features that are the yield of some complete $\tau \in CL(G_{MG})$. In order to
achieve this, $G$ will operate w.r.t. equivalence classes of a finite partition
of $RCL(G_{MG})$ rather than on single expressions. For each
$\tau \in RCL(G_{MG})$ there will be some nonterminal $T \in N$ coding $\tau$'s
structure as it matters to merge and move, but ignoring non-syntactic features
(cf. (D1),(D2)). $\tau$'s phonetic yield will be separately coded by some
$p_T \in O$, a finite tuple of strings of phonetic features, that takes into
account the structural information stored in $T$ (cf. (D3),(D4)). $p_T$ will
be derivable from $T$ in $G$ as a finite recursion on functions in $F$, since
for each particular application of merge or move in $G_{MG}$ there will be some
nonterminating rule in $R$ simulating the corresponding structure building step
in $G_{MG}$

^ Recall that $move(\tau)$ is defined for $\tau \in CL(G)$ only in case that
there is exactly one maximal subtree of $\tau$ that has a particular licensee
feature allowing the subtree's "movement into specifier position."






J. Michaelis 



(Proposition [?]). Vice versa, whenever some $p_T \in O$ will be derivable in
$G$ from some $T \in N$ that is different from $S$, there will be some
$\tau \in RCL(G_{MG})$ to which $T$ and $p_T$ correspond as outlined above
(Proposition 4.4).

W.l.o.g. we may assume the head-label of each $\tau \in Lex$ to contain at least
some category feature x $\in$ base. Moreover, w.l.o.g. we may assume each
$\tau \in Lex$ to be simple (a head). Thus, we can identify $\tau$ with its
head-label. Doing so, for technical reasons we define sets suf($Cat$) and
suf($-l_i$) for $1 \le i \le m$ by

suf($Cat$) := $\{\kappa \in Cat^* \mid$ ex. $\kappa' \in Cat^*$ and $\pi\iota \in P^*I^*$ with $\kappa'\kappa\pi\iota \in Lex\}$
suf($-l_i$) := $\{\kappa \in$ suf($Cat$) $\mid \kappa = \epsilon$ or $\kappa = -l_i\lambda$ for some $\lambda \in Cat^*\}$

By (N3) each suf($-l_i$) as well as suf($Cat$) is finite, and
suf($-l_i$) $\subseteq$ licensees$^*$. Furthermore, we define

$R_m := \{i_1 \ldots i_n \mid n \in \mathbb{N},\ i_1, \ldots, i_n \in \{1, \ldots, m\}$ with $i_j \neq i_k$ if $j \neq k\}$

Note that $R_m$ is finite, because in particular $|\alpha| \le m$ for each
$\alpha \in R_m$. Finally, we take strong, weak, overt, covert, true, false, sim
and com to be pairwise distinct new symbols and now give the formal definitions
of $N$ and $O$, the set of nonterminals and the set of tuples of terminal
strings, respectively, while we motivate these definitions in more detail
afterwards (cf. Definition 4.1).
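
The finiteness of suf($Cat$), suf($-l_i$) and $R_m$ can be made concrete by enumerating these sets for a small grammar. The sketch below is an illustration only and makes assumptions beyond the text: lexical items are reduced to the tuple of their syntactic features (phonetic and semantic features are left out), the function names are hypothetical, and the sample lexicon `G2_SYN` lists the syntactic parts of the items of $G_2$ from Example 3.3.

```python
from itertools import permutations


def suf_cat(lexicon):
    """All suffixes (including the empty one) of the syntactic feature strings."""
    return {syn[i:] for syn in lexicon for i in range(len(syn) + 1)}


def suf_licensee(lexicon, licensee):
    """Those suffixes that are empty or start with the given licensee."""
    return {k for k in suf_cat(lexicon) if k == () or k[0] == licensee}


def r(m):
    """R_m: all repetition-free sequences over {1, ..., m}, as tuples."""
    return {p for n in range(m + 1) for p in permutations(range(1, m + 1), n)}


G2_SYN = [
    ("c",),
    ("b1", "-l1"),
    ("=b1", "b2", "-l2"),
    ("=b2", "+L1", "c1", "-l1"),
    ("=c2", "+L1", "c1", "-l1"),
    ("=c1", "+L2", "c2", "-l2"),
    ("=b2", "+L1", "d1"),
    ("=c2", "+L1", "d1"),
    ("=d1", "+L2", "d2"),
    ("=d2", "c"),
]

print(sorted(suf_licensee(G2_SYN, "-l1")))
print(len(r(2)), len(r(3)))
```

Since a sequence in $R_m$ has length at most $m$ and no repeated index, $R_m$ has $\sum_{n=0}^{m} m!/(m-n)!$ elements, which is what the enumeration counts.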

• Each nonterminal $T \in N$ is either the start symbol $S$ or an $m{+}2$-tuple
of the form $(\hat\mu_0, \ldots, \hat\mu_m, t)$ with $t \in \{sim, com\}$ and
$\hat\mu_i$ a triple $(\mu_i, a_i, \alpha_i)$, where

(n1) $\mu_0 \in$ suf($Cat$) with $\mu_0 \neq \epsilon$ and $a_0 \in \{strong, weak\}$,

(n2) $\mu_i \in$ suf($-l_i$) and $a_i \in \{overt, covert, true, false\}$ for $1 \le i \le m$,

(n3) $\alpha_i \in \{1, \ldots, m\}^*$ for $0 \le i \le m$ with $\alpha_0\alpha_1 \cdots \alpha_m \in R_m$

such that for $1 \le j \le m$, in addition, (n4) and (n5) hold.

(n4) If $\alpha_j \neq \epsilon$ then $\alpha_i = \beta j \gamma$ for some $0 \le i \le m$, $i \neq j$, and $\beta, \gamma \in \{1, \ldots, m\}^*$.
(n5) $\mu_j \neq \epsilon$ iff $a_j \neq false$ iff $\alpha_i = \beta j \gamma$ for some $0 \le i \le m$, $\beta, \gamma \in \{1, \ldots, m\}^*$.

Take $\triangleleft_T$ to be the following binary relation on $\{0, 1, \ldots, m\}$ induced by the $\alpha_i$'s:
($\triangleleft_T$) $i \triangleleft_T j$ iff $\alpha_i = \beta j \gamma$ for some $\beta, \gamma \in \{1, \ldots, m\}^*$.

Hence, if zOtJ then i ^ j , pi ^ e and fy false by (nl), (n3)-(n5). Let < 1 ^ and 
< 5 ^ denote the transitive and the reflexive, transitive closure of <t, respectively. 
Then take -<x to be the following binary relation on (0, 1, . . . , to}: 

(^t) j A: iff Qfj = Pf^k'S 

for some 0 < z, j', k' < m and /3, 7 , 5 G (1, . . . , to}* such that k' <\}.k. 

^ Note that for each relevant $\tau \in Lex$ there will be two terminating rules
$T \to p_T \in R$ with $T \in N$ and $p_T \in O$ coding $\tau$ as just mentioned
(cf. (r5)).

^ Recall that we are actually interested in complete expressions in
$CL(G_{MG})$, created from expressions in $Lex$ by a finite number of
applications of merge and move.
It is easy to verify that the set $N$ is in fact finite. Disregarding
non-syntactic features, we can use $N$ to characterize the relevant expressions
in $G_{MG}$, which constitute the set $RCL(G_{MG})$ by Proposition 3.7. This set
is generally not finite, itself. The phonetic yield of an expression from
$RCL(G_{MG})$ can be characterized then as a particular tuple from
$(P^*)^{m+2}$ depending on a corresponding nonterminal from $N$.

• We let $O = \bigcup_{i=1}^{m+2} (P^*)^i$, $P$ the set of phonetic features in
$G_{MG}$.

Consider $\tau \in RCL(G_{MG})$. For $1 \le i \le m$ take, if existing, $\tau_i$
to be the unique maximal proper subtree of $\tau$ that has licensee $-l_i$.
Otherwise, take $\tau_i$ to be a single node labeled $\epsilon$. Set
$\tau_0 = \tau$ and for $0 \le i \le m$ let $r_i$ denote the root of $\tau_i$.

Now, let $T = (\hat\mu_0, \hat\mu_1, \ldots, \hat\mu_m, t) \in N$ with
$t \in \{sim, com\}$ and $\hat\mu_i = (\mu_i, a_i, \alpha_i)$ for
$0 \le i \le m$ according to (n1)-(n5), let
$p_T = (\pi_H, \pi_0, \pi_1, \ldots, \pi_m) \in (P^*)^{m+2}$.
Definition 4.1. The pair $(T, p_T)$ corresponds to $\tau$ if (D1)-(D4) are true.

(D1) For $0 \le i \le m$, $\mu_i$ is the prefix of $\tau_i$'s head-label
consisting of just the syntactic features, and $t = sim$ iff $\tau$ is simple.

(D2) For $0 \le i, j \le m$ with $\mu_i, \mu_j \neq \epsilon$,
$i \triangleleft_T^+ j$ iff $r_i \triangleleft_\tau^+ r_j$, and
$i \prec_T j$ iff $r_i \prec_\tau r_j$.

(D3) If $a_0 = weak$ then $\pi_H = \epsilon$ and $\pi_0$ is the phonetic yield
of $\tau_0 = \tau$ except for each substring that is the phonetic yield of some
$\tau_i$ with $1 \le i \le m$ and $0 \triangleleft_T^+ i$ such that there is
$1 \le j \le m$ with $0 \triangleleft_T^+ j \triangleleft_T^* i$ and
$a_j = overt$.
If $a_0 = strong$ then $\pi_H$ consists of the (ordered) phonetic features $\pi$
of the head-label of $\tau_0 = \tau$, while $\pi_0$ is as in case $a_0 = weak$
but lacking the substring $\pi$.

(D4) For $1 \le i \le m$, if $a_i \in \{covert, true, false\}$ then
$\pi_i = \epsilon$. If $a_i = overt$ then $\pi_i$ is the phonetic yield of
$\tau_i$ except for each substring that is the phonetic yield of some $\tau_j$
with $1 \le j \le m$ and $i \triangleleft_T^+ j$ such that there is
$1 \le k \le m$ with $i \triangleleft_T^+ k \triangleleft_T^* j$ and
$a_k = overt$.

Note that (D1) provides a method to install a finite partition $\mathcal{P}$ on $RCL(G_{MG})$.
In the given manner, to each $\tau \in RCL(G_{MG})$ exactly one element belonging to
the product $suf(Cat) \times suf(-l_1) \times \ldots \times suf(-l_m) \times \{sim, com\}$ can be assigned.
(D2) can then be seen as introducing a refinement of $\mathcal{P}$: Expressions $\tau$
from one equivalence class are distinguished w.r.t. proper dominance, $\lhd^{+}_\tau$, and
precedence, $\prec_\tau$, as it holds between each two distinct maximal projections $\tau_i$ and
$\tau_j$ whose head-labels start with some licensee $-l_i$ and $-l_j$, respectively. This
can be achieved by assigning to each $\tau \in RCL(G_{MG})$ a particular $m+1$-tuple
$(\alpha_0, \alpha_1, \ldots, \alpha_m)$ with $\alpha_i \in \{1, \ldots, m\}^*$ for $0 \le i \le m$ according to (n3)-(n5).

Again let $T = (\hat\mu_0, \ldots, \hat\mu_m, t) \in N$ with $t \in \{sim, com\}$ and $\hat\mu_i = (\mu_i, a_i, \alpha_i)$
for $0 \le i \le m$ as in (n1)-(n5), and let $p_T = (\pi_H, \pi_0, \ldots, \pi_m) \in (P^*)^{m+2}$ be such
that $(T, p_T)$ corresponds to $\tau \in RCL(G_{MG})$ according to Definition 4.1. For
$0 \le i \le m$ each $\mu_i$ and $\alpha_i$ as well as $t$ is unique, because (D1) and (D2) hold.

Recall fn. El 

As a finite product of finite sets this product is also a finite set. 









For each possible combination of $a_i$'s, $0 \le i \le m$, there is exactly one $p_T$ that
satisfies the requirements of (D3) and (D4).

The $\mu_i$'s, the $\alpha_i$'s and $t$ determine the equivalence class of $\tau$ w.r.t. the refined
partition $\mathcal{P}_{ref}$ on $RCL(G_{MG})$. Since either $a_0 = strong$ or $a_0 = weak$, we have
added the possibility to respectively code whether the category x (of the head-
label) of $\tau$ has to be selected by strong =X or X= or by weak =x. For $1 \le i \le m$ we
have $a_i = false$ iff there is no subtree $\tau_i$ that has licensee $-l_i$. By $a_i = overt$,
$a_i = covert$ or $a_i = true$ we are able to respectively code whether we expect the
maximal subtree $\tau_i$ with licensee $-l_i$ to move overtly, covertly or just to move in
a later derivation step. In this sense, according to (D3) and (D4), for $0 \le i \le m$
the component $\pi_i$ of $p_T$ specifies the “non-extractable” part of the phonetic yield
of $\tau_i$, i.e. no overt movement can apply such that a proper subconstituent of $\tau_i$
is extracted pied piping some (proper) subpart of $\pi_i$. Recall that $\tau_0 = \tau$, here.

Example 4.2. Let the MG $G_2$ be as in the earlier example. Consider the partition $\mathcal{P}$ on
$RCL(G_2)$ induced by $suf(Cat) \times suf(-l_1) \times suf(-l_2) \times \{sim, com\}$. In case of $G_2$
the corresponding refinement $\mathcal{P}_{ref}$ is identical with $\mathcal{P}$. $RCL(G_2) \setminus RCL^0(G_2)$, the
set of complex expressions belonging to $RCL(G_2)$, divides into ten equivalence
classes. One of which is finite, namely represented by $(b_2{-}l_2, -l_1, \epsilon, com)$.

The other classes and their respective representatives are 

|.j.( 4 fc+ 2 ) I ^ g li , — li , — 12 , com), 

I /c S IN} and ( +Lidi , — li , — 12 , com), 

|T-( 4 fc-i) I fc g IN, 0 } and (ci— li , e , — 12 , com), 

I g IN, A: ^ 0 } and ( di , e , — 12 , com), 

I fc g IN, fc yf 0 } and (+L2C2-I2 , -li , -I2 , com), 

|.y( 4 fe) I ^ g IN, A: ^ 0 } and ( +L2d2 , e ,— l2,com), 

{.^( 4 fc+i) I A: g IN, fc yX 0 } and ( C2— 12 , ~li , £ , com), 

}.y( 4 fe+i) I A: g IN, fc 0 } and ( d2 , e , e , com), 

and finally | fc € IN, fc y^ 0 } and (c,e, e, com). Now, let N2 be the 

nonterminal set according to (n1)-(n5) for $G_2$ and consider e.g.

$T = ((+L_1 c_1 {-}l_1, weak, 2), (-l_1, overt, \epsilon), (-l_2, overt, 1), com) \in N_2$,

$U = ((+L_1 d_1, weak, 2), (-l_1, overt, \epsilon), (-l_2, overt, 1), com) \in N_2$

and $p_T = (\epsilon, /a_2/, /a_2/^{k+1}, /a_1/^{k+1})$, $p_U = (\epsilon, \epsilon, /a_2/^{k+1}, /a_1/^{k+1})$ with $k \in \mathbb{N}$.
Then $(T, p_T)$ and $(U, p_U)$ correspond to $\tau^{(4k+2)}$ and $\upsilon^{(4k+2)}$, respectively.

Turning back to the general case of the MG $G_{MG}$, for the corresponding LCFRS
$G$ we will now define the set $F$ of functions, manipulating tuples of tuples of
terminal strings, and the set $R$ of rewriting rules. In particular, for all $\tau$, $\upsilon$ and
$\phi \in RCL(G_{MG})$, if $\tau = merge(\upsilon, \phi)$ then rules of the form $T \to merge_{UV}(U, V)$
and sometimes $T \to Merge_{UV}(U, V)$ will belong to $R$ (cf. (r1), (r2)), where
$merge_{UV}$ and $Merge_{UV} \in F$ will be applicable to the pair $(p_U, p_V)$ resulting
in $p_T$. Similarly, if $\tau = move(\upsilon)$ there will be rules $T \to move_U(U)$ and some-
times $T \to Move_U(U) \in R$ (cf. (r3), (r4)), where $move_U$ and $Move_U \in F$ will
be applicable to $p_U$ calculating $p_T$ as value. Here we have $T$, $U$ and $V \in N$,
while $p_T$, $p_U$ and $p_V \in O$ such that $(T, p_T)$, $(U, p_U)$ and $(V, p_V)$ respectively
correspond to $\tau$, $\upsilon$ and $\phi$ in the way given with Definition 4.1.

• The set $F$ of functions and the set $R$ of rewriting rules are simultaneously
defined w.r.t. the occurrence of an $f \in F$ within an $r \in R$.

Nonterminating rules: First of all we define two initial rules by

(r0) $S \to con(T) \in R$ for $T = (\hat\mu_0, \hat\mu_1, \ldots, \hat\mu_m, t) \in N$

with $\hat\mu_0 = (c, weak, \epsilon)$, $\hat\mu_i = (\epsilon, false, \epsilon)$ for $1 \le i \le m$ and $t \in \{sim, com\}$. The
concatenation function $con : (P^*)^{m+2} \to P^*$ is given by $x \mapsto x_H x_0 x_1 \ldots x_m$,
where $x$ denotes the $m+2$-tuple $(x_H, x_0, x_1, \ldots, x_m)$ consisting of the variables
$x_H, x_0, x_1, \ldots, x_m$.
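
As a minimal sketch of the concatenation function, assume that each component of an $m+2$-tuple is modelled as a Python tuple of phonetic feature names. The function name `con` follows the text, while the sample tuple is a hypothetical value chosen only for illustration.

```python
def con(p):
    """Concatenate the components (x_H, x_0, x_1, ..., x_m) from left to right."""
    result = ()
    for component in p:
        result += tuple(component)
    return result


# Hypothetical tuple for m = 2 whose only non-empty component is x_0.
p_T = ((), ("/e3/", "/e2/", "/e1/"), (), ())
yield_of_p_T = con(p_T)
```

For a tuple of the shape required by (r0), where only the component $x_0$ is non-empty, `con` simply returns that component.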

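For $x \in base$ suppose that

$x\lambda \in suf(Cat)$ with $\lambda \in Cat^*$, i.e. $\lambda \in licensees^*$,

$s\kappa \in suf(Cat)$ with $s \in \{$=x, =X, X=$\}$ and $\kappa \in Cat^*$,

$\nu_i, \xi_i \in suf(-l_i)$ for $1 \le i \le m$ with $\nu_i = \epsilon$ or $\xi_i = \epsilon$

such that for $1 \le j \le m$,

$\nu_j = \xi_j = \epsilon$ if $\lambda = -l_j\lambda'$ with $\lambda' \in Cat^*$.

We choose $b_0, c_0 \in \{strong, weak\}$, $b_i, c_i \in \{overt, covert, true, false\}$ for
$1 \le i \le m$, $\beta_i, \gamma_i \in \{1, \ldots, m\}^*$ for $0 \le i \le m$, and $u, v \in \{sim, com\}$ such that

$U = ((s\kappa, b_0, \beta_0), (\nu_1, b_1, \beta_1), \ldots, (\nu_m, b_m, \beta_m), u) \in N$,

$V = ((x\lambda, c_0, \gamma_0), (\xi_1, c_1, \gamma_1), \ldots, (\xi_m, c_m, \gamma_m), v) \in N$

and such that, additionally,

if $s \in \{$=x$\}$ then $c_0 = weak$,

if $s \in \{$=X, X=$\}$ then $c_0 = strong$ and $u = sim$.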

Proceeding, if $\lambda = \epsilon$ we set $j = 0$ and take

whereas, if $\lambda = -l_j\lambda'$ for some $1 \le j \le m$ and $\lambda' \in Cat^*$ we take

where for $1 \le i \le m$ we have

$\hat\mu'_i = \hat\mu''_i = (\nu_i, b_i, \beta_i)$ if $i \neq j$ and $\xi_i = \epsilon$,

$\hat\mu'_i = \hat\mu''_i = (\xi_i, c_i, \gamma_i)$ if $i \neq j$ and $\nu_i = \epsilon$,

$\hat\mu'_i = (\lambda, covert, \gamma_0)$ and $\hat\mu''_i = (\lambda, overt, \gamma_0)$ if $i = j$ for $j \neq 0$.

Then, for $merge_{UV}$ and $Merge_{UV} \in F$ as defined below, we finally let

(r1) $T' \to merge_{UV}(U, V) \in R$, and

(r2) $T'' \to Merge_{UV}(U, V) \in R$ if $\lambda = -l_j\lambda'$ for $1 \le j \le m$ and $\lambda' \in Cat^*$.



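Take $x$ and $y$ to be the $m+2$-tuples $(x_H, x_0, x_1, \ldots, x_m)$ and $(y_H, y_0, y_1, \ldots, y_m)$
consisting of the variables $x_H, x_0, x_1, \ldots, x_m$ and $y_H, y_0, y_1, \ldots, y_m$, respectively.

The function $merge_{UV} : (P^*)^{m+2} \times (P^*)^{m+2} \to (P^*)^{m+2}$ is defined by

\[
(x, y) \mapsto (X_H, X_0, x_1 y_1, \ldots, x_m y_m)
\]

with

\[
\begin{cases}
X_H = x_H y_H,\; X_0 = x_0 y_0 & \text{in case } s = \text{=x},\ u = sim \\
X_H = x_H y_H,\; X_0 = y_0 x_0 & \text{in case } s = \text{=x},\ u = com \\
X_H = x_H y_H,\; X_0 = x_0 y_0 & \text{in case } s = \text{=X},\ b_0 = strong \\
X_H = x_H,\; X_0 = x_0 y_H y_0 & \text{in case } s = \text{=X},\ b_0 = weak \\
X_H = y_H x_H,\; X_0 = x_0 y_0 & \text{in case } s = \text{X=},\ b_0 = strong \\
X_H = x_H,\; X_0 = y_H x_0 y_0 & \text{in case } s = \text{X=},\ b_0 = weak
\end{cases}
\]

The function $Merge_{UV} : (P^*)^{m+2} \times (P^*)^{m+2} \to (P^*)^{m+2}$ is defined by

\[
(x, y) \mapsto (X_H, X_0, x_1 y_1, \ldots, x_{j-1} y_{j-1}, x_j y_j y_0, x_{j+1} y_{j+1}, \ldots, x_m y_m)
\]

with

\[
\begin{cases}
X_H = x_H y_H,\; X_0 = x_0 & \text{in case } s = \text{=x} \\
X_H = x_H y_H,\; X_0 = x_0 & \text{in case } s = \text{=X},\ b_0 = strong \\
X_H = x_H,\; X_0 = x_0 y_H & \text{in case } s = \text{=X},\ b_0 = weak \\
X_H = y_H x_H,\; X_0 = x_0 & \text{in case } s = \text{X=},\ b_0 = strong \\
X_H = x_H,\; X_0 = y_H x_0 & \text{in case } s = \text{X=},\ b_0 = weak
\end{cases}
\]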
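
The two case tables can be traced with a short sketch. It assumes that strings over $P$ are modelled as Python `str` values, that the selector type `s` is passed as one of the strings `"=x"`, `"=X"`, `"X="`, and that `u`, `b0` and `j` carry the corresponding values from $U$ and $V$. The names `merge_small` and `merge_capital` are hypothetical stand-ins for $merge_{UV}$ and $Merge_{UV}$.

```python
def merge_small(x, y, s, u, b0):
    """Sketch of merge_UV: y_0 is 'frozen' inside the second component X_0."""
    xH, x0, *xs = x
    yH, y0, *ys = y
    if s == "=x":
        XH = xH + yH
        X0 = x0 + y0 if u == "sim" else y0 + x0
    elif s == "=X":
        XH, X0 = (xH + yH, x0 + y0) if b0 == "strong" else (xH, x0 + yH + y0)
    elif s == "X=":
        XH, X0 = (yH + xH, x0 + y0) if b0 == "strong" else (xH, yH + x0 + y0)
    else:
        raise ValueError(f"unknown selector type: {s}")
    return (XH, X0, *[a + b for a, b in zip(xs, ys)])


def merge_capital(x, y, s, b0, j):
    """Sketch of Merge_UV: y_0 is kept apart in component j (1 <= j <= m)."""
    xH, x0, *xs = x
    yH, y0, *ys = y
    if s == "=x":
        XH, X0 = xH + yH, x0
    elif s == "=X":
        XH, X0 = (xH + yH, x0) if b0 == "strong" else (xH, x0 + yH)
    elif s == "X=":
        XH, X0 = (yH + xH, x0) if b0 == "strong" else (xH, yH + x0)
    else:
        raise ValueError(f"unknown selector type: {s}")
    rest = [a + b for a, b in zip(xs, ys)]
    rest[j - 1] += y0  # x_j y_j y_0
    return (XH, X0, *rest)


# Hypothetical tuples for m = 2.
x = ("", "v", "", "")
y = ("", "n", "", "")
frozen = merge_small(x, y, "=x", "sim", "weak")
apart = merge_capital(x, y, "=x", "weak", 1)
```

The only difference between the two functions is where $y_0$ ends up: inside $X_0$ for `merge_small`, and in component $j$ for `merge_capital`.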



In order to illustrate the way in which $G$ “does its job” concerning the operation
merge, consider $\upsilon$ and $\phi \in RCL(G_{MG})$ with respective head-labels $s\kappa\zeta$ and $x\lambda\eta$
for some $x \in base$, $s \in \{$=x, =X, X=$\}$, $\kappa, \lambda \in Cat^*$ and some $\zeta, \eta \in P^*I^*$ such that
$\tau = merge(\upsilon, \phi) \in RCL(G_{MG})$. Assume $U, V \in N$ and $p_U = (\rho_H, \rho_0, \ldots, \rho_m)$,
$p_V = (\sigma_H, \sigma_0, \ldots, \sigma_m) \in (P^*)^{m+2}$ to be such that $(U, p_U)$ and $(V, p_V)$ respec-
tively correspond to $\upsilon$ and $\phi$ in the sense of Definition 4.1. Then $U$ and $V$ are
as in (r1), and also as in (r2) in case $\lambda \neq \epsilon$.

In particular, $\rho_j\sigma_j = \epsilon$ in case that $\lambda = -l_j\lambda'$ for some $1 \le j \le m$ and $\lambda' \in Cat^*$.

For $T'$ as in (r1) and $p_{T'} = merge_{UV}(p_U, p_V)$, $(T', p_{T'})$ corresponds to $\tau$
in any case. For $T''$ as in (r2) and $p_{T''} = Merge_{UV}(p_U, p_V)$, also $(T'', p_{T''})$
corresponds to $\tau$ in case that $\lambda = -l_j\lambda'$ for some $1 \le j \le m$ and $\lambda' \in Cat^*$.
In the latter case, in terms of the MG $G_{MG}$, by canceling the category feature
$x$ from $\phi$'s head-label while merging $\upsilon$ and $\phi$ an expression $\tau_j$ that has licensee
$-l_j$ becomes a proper subtree of $\tau$. Up to the deletion of the instance of $x$, $\tau_j$
is identical with $\phi$. I.e. in particular the phonetic yield of both is identical. In
a derivation creating a complete expression, $\tau_j$ must move to check its licensee
at some later derivation step. In (r1) this later application of move is expected
to be covert, coded in $T'$ by $\hat\mu'_j$ stating that the $j+2$-th component of $p_{T'}$ is
empty. This chimes in with the definition of $merge_{UV}$ according to which $\sigma_0$, the
“non-extractable” part of the yield of $\phi$ (i.e. of $\tau_j$) specified by $V$, is “frozen”
within the 2nd component of $p_{T'}$, the “non-extractable” part of the yield of $\tau$
specified by $T'$. In (r2) the later application of move is expected to be overt,
coded in $T''$ by $\hat\mu''_j$. Here, applying $Merge_{UV}$ to $(p_U, p_V)$, $\sigma_0$ remains a part on
its own as $j+2$-th component of $p_{T''}$, since $\rho_j\sigma_j = \epsilon$.

If $s \in \{$=X, X=$\}$ then $\phi$ is selected strongly and $\upsilon$ is simple. In this case
$c_0 = strong$, and therefore the (ordered) phonetic features $\sigma$ of $\phi$'s head coincide
with $\sigma_H$, the 1st component of $p_V$. Applying $Merge_{UV}$ or $merge_{UV}$ to the pair
$(p_U, p_V)$, $\sigma_H$ will be incorporated into the selecting head $\upsilon$, i.e. concatenated
with the phonetic features $\rho$ of $\upsilon$ “in the right manner.” Note that in terms
of the LCFRS $G$, depending on whether the category feature of $\upsilon$ is expected
to be selected strong or weak, i.e. whether $b_0 = strong$ or $b_0 = weak$, $\rho$ is
either $\rho_H$ or $\rho_0$ according to (D3). If $s = $ =x then $\phi$ is selected weakly. Thus,
$c_0 = weak$. Therefore, the phonetic features $\sigma$ of $\phi$'s head are a substring of $\sigma_0$,
the “non-extractable” part of the yield of $\phi$, and $\sigma_H = \epsilon$.

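The existence of both possibilities is granted by the terminating rules (cf. (r5)).

Now, for some $1 \le j \le m$, suppose that

$\nu_j \in suf(-l_j)$ with $\nu_j = -l_j\lambda$ for some $\lambda \in licensees^*$,

$l\kappa \in suf(Cat)$ with $l \in \{+l_j, +L_j\}$ and $\kappa \in Cat^*$,

$\nu_i \in suf(-l_i)$ for $1 \le i \le m$ with $i \neq j$

such that for $1 \le k \le m$ with $k \neq j$,

$\nu_k = \epsilon$ if $\lambda = -l_k\lambda'$ with $\lambda' \in Cat^*$.

Choose $b_0 \in \{strong, weak\}$, $b_i \in \{overt, covert, true, false\}$ for $1 \le i \le m$,
and $\beta_i \in \{1, \ldots, m\}^*$ for $0 \le i \le m$ such that

$U = ((l\kappa, b_0, \beta_0), (\nu_1, b_1, \beta_1), \ldots, (\nu_m, b_m, \beta_m), com) \in N$,

and such that, additionally,

if $l = +L_j$ then $b_j \in \{overt, true\}$,

if $l = +l_j$ then $b_i \in \{covert, true\}$ for $1 \le i \le m$ with $j \lhd^{*}_U i$.

If $\lambda = \epsilon$ we set $k = 0$ and take

$T' = ((\kappa, b_0, \beta_j\beta), \hat\mu'_1, \ldots, \hat\mu'_m, com) \in N$,

if $\lambda = -l_k\lambda'$ for some $1 \le k \le m$ and $\lambda' \in Cat^*$ we take

$T' = ((\kappa, b_0, k\beta), \hat\mu'_1, \ldots, \hat\mu'_m, com) \in N$ in general, and

$T'' = ((\kappa, b_0, k\beta), \hat\mu''_1, \ldots, \hat\mu''_m, com) \in N$ in case that $b_j = overt$.

Here, $\beta = \zeta_0\eta_0$ if $0 \lhd_U j$, where $\zeta_0, \eta_0 \in \{1, \ldots, m\}^*$ with $\beta_0 = \zeta_0 j \eta_0$, and $\beta = \beta_0$
otherwise. Further, if $b_j = overt$ then for $1 \le i \le m$ we have

$\hat\mu'_i = (\lambda, covert, \beta_j)$ and $\hat\mu''_i = (\lambda, overt, \beta_j)$ if $i = k$,

and otherwise

\[
\hat\mu'_i = \hat\mu''_i =
\begin{cases}
(\epsilon, false, \epsilon) & \text{if } i = j \text{ and } j \neq k \\
(\nu_i, b_i, \zeta_i\eta_i) & \text{if } i \lhd_U j, \text{ where } \zeta_i, \eta_i \in \{1, \ldots, m\}^* \text{ with } \beta_i = \zeta_i j \eta_i \\
(\nu_i, b_i, \beta_i) & \text{otherwise}
\end{cases}
\]

and, if $b_j \in \{covert, true\}$ then for $1 \le i \le m$ we have

\[
\hat\mu'_i =
\begin{cases}
(\lambda, true, \beta_j) & \text{if } i = k \\
(\epsilon, false, \epsilon) & \text{if } i = j \text{ and } j \neq k \\
(\nu_i, true, \beta_i) & \text{if } j \lhd^{+}_U i \\
(\nu_i, b_i, \zeta_i\eta_i) & \text{if } i \lhd_U j, \text{ where } \zeta_i, \eta_i \in \{1, \ldots, m\}^* \text{ with } \beta_i = \zeta_i j \eta_i \\
(\nu_i, b_i, \beta_i) & \text{otherwise}
\end{cases}
\]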

Now, for the functions $move_U$, $Move_U \in F$ as defined below we let

(r3) $T' \to move_U(U) \in R$ in any case, and

(r4) $T'' \to Move_U(U) \in R$ if $b_j = overt$, $\lambda = -l_k\lambda'$ for $1 \le k \le m$, $\lambda' \in Cat^*$.

Again let $x$ denote the $m+2$-tuple $(x_H, x_0, x_1, \ldots, x_m)$ consisting of the variables
$x_H, x_0, x_1, \ldots, x_m$.

The function $move_U : (P^*)^{m+2} \to (P^*)^{m+2}$ is defined by

\[
x \mapsto (x_H, x_j x_0, x_1, \ldots, x_{j-1}, \epsilon, x_{j+1}, \ldots, x_m)
\]

The function $Move_U : (P^*)^{m+2} \to (P^*)^{m+2}$ is defined by

\[
x \mapsto
\begin{cases}
(x_H, x_0, \ldots, x_{j-1}, x_j, x_{j+1}, \ldots, x_m) & \text{for } k = j \\
(x_H, x_0, \ldots, x_{k-1}, x_j x_k, x_{k+1}, \ldots, x_{j-1}, \epsilon, x_{j+1}, \ldots, x_m) & \text{for } k < j \\
(x_H, x_0, \ldots, x_{j-1}, \epsilon, x_{j+1}, \ldots, x_{k-1}, x_j x_k, x_{k+1}, \ldots, x_m) & \text{for } k > j
\end{cases}
\]
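
The effect of the two movement functions on a tuple can be sketched as follows, again assuming that strings over $P$ are modelled as Python `str` values. The names `move_small` and `move_capital` are hypothetical stand-ins for $move_U$ and $Move_U$, and the indices `j` and `k` range over $1, \ldots, m$ as in (r3) and (r4).

```python
def move_small(x, j):
    """Sketch of move_U: component j is emptied and prefixed to x_0."""
    xH, x0, *xs = x
    moved = xs[j - 1]
    xs[j - 1] = ""
    return (xH, moved + x0, *xs)


def move_capital(x, j, k):
    """Sketch of Move_U: component j is emptied and prefixed to component k."""
    if k == j:
        return tuple(x)  # the material stays in place as component j
    xH, x0, *xs = x
    moved = xs[j - 1]
    xs[j - 1] = ""
    xs[k - 1] = moved + xs[k - 1]
    return (xH, x0, *xs)


# Hypothetical tuple for m = 2.
x = ("", "v", "n", "")
landed = move_small(x, 1)
passed_on = move_capital(x, 1, 2)
```

In `move_small` the moved material lands in the "non-extractable" part $x_0$, whereas `move_capital` keeps it available in component $k$ for a later overt movement step.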

Let us briefly discuss how the operation move is mimicked by $G$. Consider $\tau$
and $\upsilon \in RCL(G_{MG})$ for which $\tau = move(\upsilon)$. Hence $\upsilon$ has head-label $l\kappa\zeta$ and a
maximal subtree $\phi$ with head-label $-l_j\lambda\eta$ for some $1 \le j \le m$, $l \in \{+L_j, +l_j\}$,
$\kappa, \lambda \in Cat^*$ and $\zeta, \eta \in P^*I^*$. For $1 \le i \le m$ let, if existing, $\upsilon_i$ be the maximal
subtree of $\upsilon$ that has licensee $-l_i$, otherwise let $\upsilon_i$ be the simple expression
labeled $\epsilon$. Thus, $\phi = \upsilon_j$. Take $U \in N$ and $p_U = (\rho_H, \rho_0, \ldots, \rho_m) \in (P^*)^{m+2}$ to
be such that $(U, p_U)$ corresponds to $\upsilon$ according to Definition 4.1. Then $U$ is as
in (r3), and also as in (r4) in case $\lambda \neq \epsilon$ and $b_j = overt$.

In case that $l = +l_j$ covert movement applies in terms of the MG $G_{MG}$.
Looking at (D4) we see that in terms of the LCFRS $G$, by the respective $U \in N$
we ensure that $\rho_j = \epsilon$, but also that $\rho_i = \epsilon$ for each $1 \le i \le m$ with $j \lhd^{+}_U i$. I.e. for
each $\upsilon_i$ that is a subtree of $\upsilon_j$ we demand that $\rho_i$, the “non-extractable” part
of the yield of $\upsilon_i$, is empty. As for the general MG-definition we must be aware
of the linguistically rather pathological case that $\upsilon_j$ in fact “hosts” some proper
subtree $\upsilon_i$ and at some later derivation step overt movement will apply to $\upsilon_i$
but with empty phonetic yield. This becomes possible since $\upsilon_j$ is moved covertly
before $\upsilon_i$ has been extracted such that $\upsilon_i$'s yield gets “frozen” within $\upsilon_j$'s yield
which is “left behind.” After having lost the phonetic features this way, in
terms of the LCFRS $G$ the component $b_i$ gets the value $true$, which triggers
equal behavior w.r.t. a strong licensor and its weak counterpart. This reflects
the fact that in terms of the MG $G_{MG}$ overt movement of a constituent with
empty phonetic yield has the same effect as moving this constituent covertly (up
to leaving behind a “totally empty” structure in the latter case).

For $T'$ as in (r3) and $p_{T'} = move_U(p_U)$, $(T', p_{T'})$ corresponds to $\tau$ in any case.
For $T''$ as in (r4) and $p_{T''} = Move_U(p_U)$, also $(T'', p_{T''})$ corresponds to $\tau$ in case
that $b_j = overt$ and $\lambda = -l_k\lambda'$ for some $1 \le k \le m$ and $\lambda' \in Cat^*$. Whenever
$\lambda = -l_k\lambda'$ for some $1 \le k \le m$ and $\lambda' \in Cat^*$, in terms of the MG $G_{MG}$ an
expression $\tau_k$ that has licensee $-l_k$ becomes a proper subtree of $\tau$ by canceling
the licensee $-l_j$ from $\phi$'s head-label while moving $\phi$ to specifier position of $\upsilon$.
In order to derive a complete expression, the licensee of $\tau_k$ has to be canceled
by moving $\tau_k$ at some later derivation step. Thus, we again can distinguish
two general possibilities. Of course, the corresponding instance of $-l_k$ can be
checked overtly or covertly. But here we pay somewhat more attention than
in the analogous “merge-case,” since it might be that the moved constituent has
already “lost” its phonetic yield by a particular application of covert movement
at some earlier derivation step (see above). According to (D4), only in case that
$b_j = overt$ the corresponding component $\rho_j$ of $p_U$ may include some non-empty
phonetic material, and only in this case we have to state explicitly two cases (r3)
and (r4), analogous to (r1) and (r2) in the “merge-case.” The later application of
move is “anticipated” as being covert in (r3), and as being overt in (r4).

Terminating rules: Let $\kappa\pi\iota \in Lex$ for some $\kappa \in Cat^*$, $\pi \in P^*$ and $\iota \in I^*$.
Then, consider $a_0 \in \{strong, weak\}$ and $\pi_H, \pi_0 \in \{\pi, \epsilon\}$ with $\pi_H \neq \pi_0$ such that
$\pi_0 = \pi$ iff $a_0 = weak$. We define two terminating rules by

(r5) $T \to p_T \in R$

with $T = ((\kappa, a_0, \epsilon), \hat\nu_1, \ldots, \hat\nu_m, sim) \in N$ and $p_T = (\pi_H, \pi_0, \epsilon, \ldots, \epsilon) \in (P^*)^{m+2}$,
where $\hat\nu_i = (\epsilon, false, \epsilon)$ for $1 \le i \le m$.
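
As an illustrative sketch of (r5), the two terminating rules for one lexical item can be enumerated as below. The encoding is an assumption made only for this example: the syntactic feature string $\kappa$ and the phonetic string $\pi$ are plain Python strings, $\epsilon$ is the empty string, and the function name `terminating_rules` is hypothetical. The sketch does not check the side condition $\pi_H \neq \pi_0$, so it is meant for items with non-empty $\pi$.

```python
def terminating_rules(kappa, pi, m):
    """Return the pairs (T, p_T) of the two terminating rules for kappa pi."""
    empty_slot = ("", "false", "")  # (epsilon, false, epsilon)
    rules = []
    for a0 in ("strong", "weak"):
        # pi sits in the head component iff the item is to be selected strongly.
        pi_H, pi_0 = ("", pi) if a0 == "weak" else (pi, "")
        T = ((kappa, a0, ""),) + (empty_slot,) * m + ("sim",)
        p_T = (pi_H, pi_0) + ("",) * m
        rules.append((T, p_T))
    return rules


# Hypothetical lexical item with m = 2 licensees.
rules = terminating_rules("=a1 a2 -b2", "/e2/", 2)
```

The two rules differ only in whether the phonetic features are stored in the head component $\pi_H$ or in $\pi_0$.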

This case is exemplified by the MG $G_{con}$, where $P$ is $\{/e_1/, /e_2/, /e_3/\}$, $I$ is $\emptyset$,
base is $\{c, a_1, a_2, a_3\}$, select is $\{$=$a_1$, =$a_2$, =$a_3\}$, licensors is $\{+B_1, +b_2\}$, licensees is
$\{-b_1, -b_2\}$. Lex consists of $a_1{-}b_1/e_1/$, =$a_1a_2{-}b_2/e_2/$, =$a_2{+}b_2a_3/e_3/$ and =$a_3{+}B_1c$.
The language $L(G_{con})$ derivable by $G_{con}$ consists of the single string $/e_3//e_2//e_1/$.

Like in the case when a subtree with licensee $-l_j$ is introduced applying merge.

We will continue by proving the weak equivalence of $G$ and $G_{MG}$. In order to
finally do so, we show two propositions in advance.

Proposition 4.3. Consider $\tau \in RCL(G_{MG})$. Let $q_0 \in \{strong, weak\}$, and let
$q_i \in \{overt, covert\}$ for $1 \le i \le m$. Then there is some $T = (\hat\mu_0, \ldots, \hat\mu_m, t) \in N$
with $t \in \{sim, com\}$ and $\hat\mu_i = (\mu_i, a_i, \alpha_i)$ for $0 \le i \le m$ as in (n1)-(n5), and
there is some $p_T \in (P^*)^{m+2}$ with $p_T \in L_G(T)$ such that (a) and (b) hold.

(a) $(T, p_T)$ corresponds to $\tau$ according to Definition 4.1.

(b) $a_0 = q_0$ and $a_i \in \{q_i, true\}$ for $1 \le i \le m$ in case $\mu_i \neq \epsilon$.

Proof. We have $RCL(G_{MG}) = \bigcup_{k \in \mathbb{N}} RCL^k(G_{MG})$ and $L_G(T) = \bigcup_{k \in \mathbb{N}} L^k_G(T)$
for $T \in N$. Showing (4.3$_k$) by induction on $k \in \mathbb{N}$ we will prove the proposition.

(4.3$_k$) If $q_0 \in \{strong, weak\}$ and $q_i \in \{overt, covert\}$ for $1 \le i \le m$ then
$\tau \in RCL^k(G_{MG})$ implies that there are $T = (\hat\mu_0, \ldots, \hat\mu_m, t) \in N$ and
$p_T \in (P^*)^{m+2}$ with $p_T \in L^k_G(T)$ fulfilling (a) and (b).

Since $RCL^0(G_{MG}) = Lex$, (4.3$_0$) holds according to (r5). Considering the induc-
tion step, let $\tau \in RCL^{k+1}(G_{MG})$. There is nothing to show if $\tau \in RCL^k(G_{MG})$.
Otherwise, one of two general cases arises.

Either, there are $\upsilon$ and $\phi \in RCL^k(G_{MG})$ with respective head-labels $s\kappa\zeta$ and
$x\lambda\eta$ for some $x \in base$, $s \in \{$=x, =X, X=$\}$, $\kappa, \lambda \in Cat^*$ and $\zeta, \eta \in P^*I^*$ such that
$\tau = merge(\upsilon, \phi)$ holds. Let $b_0 = q_0$, let $c_0 = strong$ iff $s \in \{$=X, X=$\}$. Now choose

$U = ((s\kappa, b_0, \beta_0), (\nu_1, b_1, \beta_1), \ldots, (\nu_m, b_m, \beta_m), u) \in N$,

$V = ((x\lambda, c_0, \gamma_0), (\xi_1, c_1, \gamma_1), \ldots, (\xi_m, c_m, \gamma_m), v) \in N$

and $p_U, p_V \in (P^*)^{m+2}$ such that $p_U \in L^k_G(U)$, $p_V \in L^k_G(V)$, and such that
$(U, p_U)$ and $(V, p_V)$ correspond to $\upsilon$ and $\phi$, respectively. Here $u, v \in \{sim, com\}$,
$\nu_i, \xi_i \in suf(-l_i)$, $b_i, c_i \in \{overt, covert, true, false\}$ for $1 \le i \le m$, and
$\beta_i, \gamma_i \in \{1, \ldots, m\}^*$ for $0 \le i \le m$. In particular, each $\nu_i$ and $\xi_i$ for $1 \le i \le m$
is unique. By induction hypothesis $U$, $V$ and $p_U$, $p_V$ not only exist, but for
$1 \le i \le m$ they can also be chosen such that $b_i \in \{q_i, true\}$ for $\nu_i \neq \epsilon$, and
$c_i \in \{q_i, true\}$ for $\xi_i \neq \epsilon$.

Recalling that merge is defined for the pair $(\upsilon, \phi)$, we conclude that $u = sim$ if
$s \in \{$=X, X=$\}$. Because $merge(\upsilon, \phi) \in RCL(G_{MG})$ we also have $\nu_i, \xi_i \in suf(-l_i)$
for $1 \le i \le m$ with $\nu_i = \epsilon$ or $\xi_i = \epsilon$ such that $\nu_i = \xi_i = \epsilon$ if $\lambda = -l_i\lambda'$ with
$\lambda' \in Cat^*$. Therefore, $U$ and $V$ are as in (r1) in any case, and also as in (r2) in
case that $\lambda \neq \epsilon$. Hence (r1') is true in any case, and (r2') in case $\lambda \neq \epsilon$.

(r1') $T' \to merge_{UV}(U, V) \in R$ and $p_{T'} = merge_{UV}(p_U, p_V) \in L^{k+1}_G(T')$,

(r2') $T'' \to Merge_{UV}(U, V) \in R$ and $p_{T''} = Merge_{UV}(p_U, p_V) \in L^{k+1}_G(T'')$

with $T' \in N$ and $merge_{UV} \in F$ as in (r1), $T'' \in N$ and $Merge_{UV} \in F$ as in (r2).

Let $T = T''$ and $p_T = p_{T''}$ in case that $q_j = overt$ and $\lambda = -l_j\lambda'$ for some
$1 \le j \le m$ and $\lambda' \in Cat^*$. Otherwise let $T = T'$ and $p_T = p_{T'}$. Comparing
the definition of $merge \in \mathcal{F}$ to the definitions of $T$ and $merge_{UV}$ or $Merge_{UV}$,
respectively, we see that $(T, p_T)$ corresponds to $\tau = merge(\upsilon, \phi)$, and that $T$ also
satisfies the conditions imposed by (b).

The second general case provides an $\upsilon \in RCL^k(G_{MG})$ for which $\tau = move(\upsilon)$.
Thus, $\upsilon$ has head-label $l\kappa\zeta$ and a maximal subtree $\phi$ with head-label $-l_j\lambda\eta$ for
some $1 \le j \le m$, $l \in \{+L_j, +l_j\}$, $\kappa, \lambda \in Cat^*$ and $\zeta, \eta \in P^*I^*$. For $b_0 = q_0$, by
induction hypothesis we can fix existing

$U = ((l\kappa, b_0, \beta_0), (\nu_1, b_1, \beta_1), \ldots, (\nu_m, b_m, \beta_m), com) \in N$,

and $p_U \in (P^*)^{m+2}$ with $p_U \in L^k_G(U)$ such that $(U, p_U)$ corresponds to $\upsilon$. Again
we have $\nu_i \in suf(-l_i)$, $b_i \in \{overt, covert, true, false\}$ for $1 \le i \le m$, and
$\beta_i \in \{1, \ldots, m\}^*$ for $0 \le i \le m$. By induction hypothesis, for all $1 \le i \le m$ with
$\nu_i \neq \epsilon$ we can choose $U$ even such that $b_j \in \{overt, true\}$ and $b_i \in \{q_i, true\}$ for
$i \neq j$ in case $l = +L_j$, and such that $b_j \in \{covert, true\}$, $b_i \in \{covert, true\}$
for $j \lhd^{+}_U i$ and $b_i \in \{q_i, true\}$ in case $l = +l_j$. Because $move(\upsilon) \in RCL(G_{MG})$, we
conclude that (r3') holds in any case, and (r4') in case that $\lambda \neq \epsilon$ and $b_j = overt$.

(r3') $T' \to move_U(U) \in R$ and $p_{T'} = move_U(p_U) \in L^{k+1}_G(T')$

(r4') $T'' \to Move_U(U) \in R$ and $p_{T''} = Move_U(p_U) \in L^{k+1}_G(T'')$

with $T' \in N$ and $move_U \in F$ as in (r3), $T'' \in N$ and $Move_U \in F$ as in (r4).

Let $T = T''$ and $p_T = p_{T''}$ in case that $b_j = q_k = overt$ and $\lambda = -l_k\lambda'$ for
some $1 \le k \le m$ and $\lambda' \in Cat^*$. Otherwise let $T = T'$ and $p_T = p_{T'}$. Looking at
the definition of $move \in \mathcal{F}$ and the definitions of $T$ and $move_U$ or $Move_U$,
respectively, we see that $(T, p_T)$ corresponds to $\tau$, and that also (b) is true. □

Let $T \in N$ and $p_T \in (P^*)^{m+2}$ be such that (a) and (b) of Proposition 4.3 are true
w.r.t. given $\tau \in RCL(G_{MG})$, $q_0 \in \{strong, weak\}$ and $q_i \in \{overt, covert\}$ for
$1 \le i \le m$. Note that this does not automatically imply that $p_T \in L_G(T)$.

Proposition 4.4. If $p_T$ is a $T$-phrase in $G$, i.e. if $p_T \in L_G(T)$ for some $T \in N$
with $T \neq S$ and $p_T \in (P^*)^{m+2}$, then there is some $\tau \in RCL(G_{MG})$ such that
$(T, p_T)$ corresponds to $\tau$ according to Definition 4.1.

Proof. Recalling again that $RCL(G_{MG}) = \bigcup_{k \in \mathbb{N}} RCL^k(G_{MG})$ holds as well as
$L_G(T) = \bigcup_{k \in \mathbb{N}} L^k_G(T)$, we prove this proposition by induction on $k \in \mathbb{N}$.

(4.4$_k$) If $p_T \in L^k_G(T)$ then $(T, p_T)$ corresponds to some $\tau \in RCL^k(G_{MG})$.

Since $Lex = RCL^0(G_{MG})$, (4.4$_0$) holds according to (r5). Considering the in-
duction step, suppose that (4.4$_k$) is true for $k \in \mathbb{N}$. The crucial case arises from
$p_T \in L^{k+1}_G(T) \setminus L^k_G(T)$ dividing into two general possibilities.

Either, $U, V \in N$ and $p_U, p_V \in (P^*)^{m+2}$ exist with $p_U \in L^k_G(U)$, $p_V \in L^k_G(V)$.
$U$ and $V$ fulfill the restrictions applying in (r1) such that (r1'') is true for $T' \in N$
and $merge_{UV} \in F$ as in (r1), or $U$ and $V$ even satisfy the restrictions applying
in (r2) such that (r2'') is true for $T'' \in N$ and $Merge_{UV} \in F$ as in (r2).

(r1'') $T \to merge_{UV}(U, V) \in R$, $p_T = merge_{UV}(p_U, p_V)$ and $T = T'$

(r2'') $T \to Merge_{UV}(U, V) \in R$, $p_T = Merge_{UV}(p_U, p_V)$ and $T = T''$

Then, by induction hypothesis there are $\upsilon$ and $\phi \in RCL^k(G_{MG})$ such that
$(U, p_U)$ and $(V, p_V)$ respectively correspond to $\upsilon$ and $\phi$ in the sense of Definition
4.1. Recall the restrictions that apply to $U$ and $V$ in (r1) or (r2), respectively.
Because of these restrictions we may conclude that $\tau = merge(\upsilon, \phi)$ is not only
defined according to (me), but also in $RCL^{k+1}(G_{MG})$ according to (R2). Since
(r1'') or (r2'') is true, we refer to the respective definitions of $T'$ and $merge_{UV}$
or $T''$ and $Merge_{UV}$ to see that $(T, p_T)$ corresponds to $\tau$.

Recall that each $\alpha_i$ for $0 \le i \le m$ and each $\mu_i$ for $0 \le i \le m$ is unique.

Secondly, $U \in N$ and $p_U \in (P^*)^{m+2}$ exist with $p_U \in L^k_G(U)$. The restric-
tions given with (r3) apply to $U$ and (r3'') holds for $T'$ and $move_U \in F$ as in
(r3), or even the restrictions given with (r4) apply to $U$ and (r4'') holds for $T''$
and $Move_U \in F$ as in (r4).

(r3'') $T \to move_U(U) \in R$, $p_T = move_U(p_U)$ and $T = T'$

(r4'') $T \to Move_U(U) \in R$, $p_T = Move_U(p_U)$ and $T = T''$

Here, by hypothesis there is an $\upsilon \in RCL^k(G_{MG})$ such that $(U, p_U)$ corre-
sponds to $\upsilon$ in the sense of Definition 4.1. Similar as for (r1'') and (r2''), in cases
(r3'') and (r4'') it is straightforward to show that $move \in \mathcal{F}$ is defined for $\upsilon$, and
that $(T, p_T)$ corresponds to $\tau = move(\upsilon) \in RCL^{k+1}(G_{MG})$. □

Corollary 4.5. $\pi \in L(G)$ iff $\pi \in L(G_{MG})$ for each $\pi \in P^*$.

Proof. As for the “if”-part consider complete $\tau \in CL(G_{MG})$ with phonetic yield
$\pi \in P^*$. Let $T = (\hat\mu_0, \ldots, \hat\mu_m, t) \in N$ with $t \in \{sim, com\}$ and $\hat\mu_i = (\mu_i, a_i, \alpha_i)$
for $0 \le i \le m$ as in (n1)-(n5), let $p_T = (\pi_H, \pi_0, \ldots, \pi_m) \in (P^*)^{m+2}$. Assume
that $(T, p_T)$ corresponds to $\tau$ according to (D1)-(D4). By Proposition 4.3 these
$T$ and $p_T$ exist even such that $p_T \in L_G(T)$ and $a_0 = weak$. Since $\tau$ is complete,
$\hat\mu_0 = (c, weak, \epsilon)$ and $\hat\mu_i = (\epsilon, false, \epsilon)$ for $1 \le i \le m$ by (D1), and therefore
$\pi_1 = \ldots = \pi_m = \epsilon$ by (D4). Moreover, $\tau$'s phonetic head-features are “at the
right place,” i.e. $\pi_H = \epsilon$ and $\pi_0 = \pi$ by (D3). Looking at (r0) and (L2), we
conclude that $\pi \in L_G(S) = L(G)$.

To prove the “only if”-part, we start with some $\pi \in L(G) = L_G(S)$. The
definition of $R$ yields that each rule applying to $S$ is of the form (r0). Thus,
according to (L2) there is some $p_T = (\pi_H, \pi_0, \ldots, \pi_m) \in (P^*)^{m+2}$ such that
$p_T \in L_G(T)$ and $\pi = con(p_T)$ for $T \in N$ as in (r0). $(T, p_T)$ corresponds to some
$\tau \in RCL(G_{MG})$ by Proposition 4.4. This $\tau$ is complete by (D1), and $\pi$ is the yield
of $\tau$, since $\pi_H = \pi_1 = \ldots = \pi_m = \epsilon$ and $\pi_0 = \pi$ by (D3) and (D4). □

Consider the $m+2$-LCFRS $G$ as constructed above for a given MG $G_{MG}$ whose
set of licensees has cardinality $m \in \mathbb{N}$.

If all licensors in $G_{MG}$ are strong, i.e. only overt movement is available, we
do not have to define productions of the form (r1) and (r3) in case $\lambda \neq \epsilon$ for
the corresponding $\lambda \in licensees^*$. More concretely, whenever in terms of the MG
$G_{MG}$ a subtree that has licensee $-x$ arises from applying merge or move, in
terms of the LCFRS $G$ we do not have to predict the case that this licensee will
be canceled by “covert movement.” Moreover, according to (D2), the structural
relation of any two subtrees with different licensees is of interest only in (r3) for
$\lambda \neq \epsilon$. Since productions of this kind are of no use at all, assuming all licensors
in $G_{MG}$ are strong, each $\hat\mu_i = (\mu_i, a_i, \alpha_i)$ of some $T = (\hat\mu_0, \ldots, \hat\mu_m, t) \in N$
according to (n1)-(n5) can be reduced to its 1st component $\mu_i$ without losing
any “necessary information.” This means that expressions from $RCL(G_{MG})$ in
terms of the LCFRS $G$ have to be distinguished only w.r.t. the partition $\mathcal{P}$
induced by $suf(Cat) \times suf(-l_1) \times \ldots \times suf(-l_m) \times \{sim, com\}$.

In case that all selection features in $G_{MG}$ are weak, $G$ is reducible even
to an $m+1$-LCFRS. This is due to the fact that the 1st component of any
$p_T \in (P^*)^{m+2}$ appearing in some complete derivation in $G_{MG}$ is necessarily
empty in this case. Therefore, if additionally $m = 0$, $G_{MG}$ is a CFG. Vice versa,
each CFG is weakly equivalent to some MG of this kind. This can be verified
rather straightforwardly e.g. by starting with a CFG in Chomsky normal form.



5 A Hierarchy of MGs 

Several well-known grammar types constitute a subclass of MCSGs. There are
a.o. the two classes of head grammars (HGs) and TAGs as well as their general-
ized extensions, the classes of LCFRSs and multicomponent TAGs (MCTAGs),
respectively. Like HGs and TAGs, LCFRSs and MCTAGs are weakly equiva-
lent. LCFRSs and MCTAGs are the union of an infinite hierarchy of grammar
classes, the respective hierarchy of m-LCFRSs and m-TAGs ($m \in \mathbb{N} \setminus \{0\}$).
It is known that each m-LCFRL is an m-TAL, a language derivable by some
m-TAG, and that each m-TAL is a 2m-LCFRL (cf. 0). We can introduce an
infinite hierarchy on the MG-class, as well.

Definition 5.1. For each $m \in \mathbb{N}$ an MG $G = (V, Cat, Lex, \mathcal{F})$ according to the
earlier definition of MGs is an m-minimalist grammar (m-MG) if the cardinality of
licensees is at most $m$. Then, the ML derivable by $G$ is an m-minimalist language (m-ML).

Let $m \in \mathbb{N}$. It is clear that each m-ML is also an $m+1$-ML.

In Sect. 4 we have shown that each m-ML is an $m+2$-LCFRL. This result
can be strengthened for $m = 0$, since the inclusion of 1-TALs within 2-LCFRLs
is known to be proper (cf. 0). Due to its “restricted type,” the 2-LCFRS that
we have constructed for a given 0-MG can be transformed to a weakly equivalent
1-TAG. Thus, each 0-ML, each language whose realization plainly relies on the
“extended” merging-type allowing for overt head movement, is even a 1-TAL, a
tree adjoining language. Indeed the class of 0-MLs is a proper extension of the
class of CFLs. Referring to the rather categorial type logical approach of
presents a 0-MG that derives the copy language $\{ww \mid w \in \{1, 2\}^*\}$.

We define an MCTAG as in 0 and call it an m-TAG if derived sequences of auxiliary
trees can be (simultaneously) adjoined to elementary tree-sequences of length at
most $m \in \mathbb{N} \setminus \{0\}$. Then, 1-TAGs are TAGs in the usual sense, and vice versa.

Generalizing the earlier example, for $m \in \mathbb{N}$ we consider the m-MG $G_m$ with
$I = \emptyset$, $P = \{/a_i/ \mid 1 \le i \le m\}$ and $base = \{c\} \cup \{b_i, c_i, d_i \mid 1 \le i \le m\}$,
while $select = \{$=$b_i$, =$c_i$, =$d_i \mid 1 \le i \le m\}$, $licensees = \{-l_i \mid 1 \le i \le m\}$
and $licensors = \{+L_i \mid 1 \le i \le m\}$. Lex consists of the simple expressions $c$
and $b_1{-}l_1/a_m/$, further =$b_ib_{i+1}{-}l_{i+1}/a_{m-i}/$, =$c_i{+}L_{i+1}c_{i+1}{-}l_{i+1}/a_{m-i}/$ and
=$d_i{+}L_{i+1}d_{i+1}$ for $1 \le i < m$, finally the 5 expressions =$b_m{+}L_1c_1{-}l_1/a_m/$,
=$b_m{+}L_1d_1$, =$c_m{+}L_1c_1{-}l_1/a_m/$, =$c_m{+}L_1d_1$ and =$d_mc$. $G_m$ derives the language
$\{/a_1/^n \ldots /a_m/^n \mid n \in \mathbb{N}\}$. We omit a proof here, pointing to the rather “deter-
ministic manner” in which expressions in $G_m$ can be derived.

Proposition 5.2. For each $m \in \mathbb{N}$, $\{a_1^n \ldots a_m^n \mid n \in \mathbb{N}\}$ is an m-ML.

As shown in [^, for each $m \in \mathbb{N} \setminus \{0\}$, $\{a_1^n \ldots a_{2m}^n \mid n \in \mathbb{N}\}$ is an m-LCFRL,
while $\{a_1^n \ldots a_{2m+1}^n \mid n \in \mathbb{N}\}$ is not. Because each m-ML is an $m+2$-LCFRL,
we therefore conclude that the hierarchy of ML-classes is infinitely increasing,
i.e. there is no $m_b \in \mathbb{N}$ such that for all $m \in \mathbb{N}$ each m-ML is also an $m_b$-ML.
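
The counting behind this conclusion can be made explicit with a short sketch. It only instantiates the three facts stated above (Proposition 5.2, the inclusion of m-MLs in $m+2$-LCFRLs, and the cited bound for counting languages). The function names are hypothetical, and the witness returned is one possible choice, not claimed to be the smallest.

```python
def counting_word(k, n):
    """The word a_1^n ... a_k^n as a tuple of symbol names."""
    return tuple(f"a{i}" for i in range(1, k + 1) for _ in range(n))


def separating_index(m):
    """An index k such that {a_1^n ... a_k^n} is a k-ML but not an m-ML.

    Every m-ML is an (m+2)-LCFRL, and the counting language with
    2r + 1 symbols is not an r-LCFRL; take r = m + 2.
    """
    return 2 * (m + 2) + 1


witness_for_0 = separating_index(0)
sample = counting_word(3, 2)
```

For $m = 0$ this yields $k = 5$: the language $\{a_1^n \ldots a_5^n\}$ is a 5-ML by Proposition 5.2, but it is not a 2-LCFRL and therefore not a 0-ML.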

6 Conclusion 

We have shown that MGs as defined in constitute a weakly equivalent (sub)class
of MCSGs as described in e.g. jSj. Thus, the result contributes to solve a
problem that has remained open in Further, we have established an infinite
hierarchy on the MG-class in relation to other hierarchies of MCSG-formalisms.

Abstract. Tree descriptions based on dominance constraints are
popular in several areas of computational linguistics including syntax,
semantics, and discourse. Tree descriptions in the language of context
unification have attracted some interest in unification and rewriting
theory. Recently, dominance constraints and context unification have
both been used in different underspecified approaches to the semantics
of scope, parallelism, and their interaction. This raises the question
whether both description languages are related. In this paper, we
show for the first time that dominance constraints can be expressed in
context unification. We also prove that dominance constraints extended
with parallelism constraints are equal in expressive power to context
unification.

Keywords. Computational linguistics, underspecification, tree de-
scriptions, computational logics, unification theory.



1 Introduction 

Logical tree descriptions are popular in many areas of computational linguistics 
and computer science. They are used to model data structures in logic pro- 
gramming, to reason with propositions and proofs in automated deduction, and 
to represent all kinds of syntactic or semantic structures in computational lin- 
guistics. In this paper, we investigate the relationship between tree descriptions 
based on dominance constraints and those in the language of context unification. 

Two Languages of Tree Descriptions. Dominance constraints are popular for
describing trees throughout computational linguistics. In syntax, they serve for
deterministic parsing HH| and to combine TAG and unification grammars m- 
In underspecified treatments of scope ambiguities, variants of dominance con- 
straints appear somewhat implicitly in many places EHEI and explicitly in two 
recent approaches UnEH]. Even more recently, dominance constraints have been 
applied to discourse semantics El, and they have been used to model informa- 
tion growth and partiality m- 

In general, the problem of solving dominance constraints is NP-complete [...]. 
Nevertheless, an implementation of a dominance constraint solver which runs 
efficiently on practical examples from scope underspecification and discourse 
is described in [...]. This solver is implemented based on finite set constraints 
in the Mozart System [...], the most recent implementation of the concurrent 
constraint programming language Oz [...]. 

M. Moortgat (Ed.): LACL’98, LNAI 2014, pp. 199-[...], 2001. 

Context unification (CU) was introduced in rewriting and unification theory 
[3, ...]. CU can be considered as second-order linear unification [...], which is a 
restriction of higher-order unification, or as an extension of string unification 
[...]. The decidability question for CU is a prominent open problem [...]. A 
decidable fragment of CU called stratified unification has been used to show 
the decidability of distributive unification [...] and for solving one-step rewriting 
constraints [...]. It is shown in [...] that context unification with two context 
variables, each of which may occur an arbitrary number of times, is decidable. 
The proof is by reduction to string unification, which is decidable according to 
Makanin’s famous result [...]. 

Tree Descriptions in Semantic Underspecification. Recently, tree descriptions 
based on dominance constraints and context unification have been proposed for 
the same application to natural-language semantics [10, 24, 14]. There, the goal 
was to find a uniform language providing underspecified representations for the 
semantics of scope, parallelism, anaphora, and their interactions (for a survey 
of semantic underspecification, see e.g. [...]). The common characteristic of both 
approaches is that they view the formulae of the semantic representation as 
trees and describe these trees. The role of dominance constraints in this context 
is to describe scope ambiguities; they are extended with constructions for 
describing parallelism and anaphoric and variable binding to obtain the Constraint 
Language over Lambda Structures (CLLS). 

Contribution. If CU and CLLS are used for the same application, an immediate 
question is if there is a formal relationship between the two languages that says 
something about their relative expressive power. 

In this paper, we show that the fragment of CLLS which provides dominance 
and parallelism constraints is equal in expressive power to context unification. 
We do this by giving satisfiability preserving, polynomial time encodings in both 
directions. The most interesting (and non-obvious) part of the construction is 
to encode dominance constraints in context unification. Once we know how to 
do that, the rest of this direction is easy. The inverse encoding can be deduced 
from a result in [24]. 

Plan of the Paper. In Section 2 we illustrate why encoding dominance constraints 
into context unification is nontrivial. In Section 3 we recall the fundamental 
definitions of trees and contexts. These definitions are used in Section 4, where 
we present dominance and parallelism constraints and briefly review a linguistic 
example. They are also used in Section 5, where we recall context unification, 
discuss first results on its expressive power, and give a linguistic example, too. 
Section 6 contains the encoding of dominance and parallelism constraints in CU, 
and Section 7 the inverse encodings. We conclude in Section 8. 
2 What Is the Problem? 



It is not obvious how to encode dominance constraints in context unification. The 
problem is that both languages describe trees from different perspectives. We 
now illustrate the difference by an example. 

Two Perspectives on Trees. Dominance constraints (and CLLS as a whole) 
describe relations between the nodes of the same tree (or a more general 
λ-structure). In contrast, the language of CU models relations between different 
trees and contexts. In CU, one cannot speak directly about the nodes of a tree; 
but we shall use contexts to speak about occurrences of subtrees later in this 
paper. 

The perspective taken when speaking about the nodes of the same tree is called 
internal in [...], in contrast to the external perspective where one relates several 
trees. Both views have a long tradition in logics. The internal view is taken in 
modal logic and in second-order monadic logic (SnS) [...], whereas the external 
view is popular in unification theory [...] and for set constraints [...]. In 
feature logics [...], both perspectives have been employed and compared [...]. 



Dominance versus Subtree Constraints. Dominance constraints contain 
node-valued variables that we write as capital letters X, Y, Z. An atomic dominance 
constraint $X \triangleleft^* Y$ holds in a tree (structure) if the node denoted by 
X is above (strictly or not) the node denoted by Y. 

A first idea for encoding dominance constraints in CU is to replace each atomic 
dominance constraint by a subtree constraint, which can be expressed in CU 
in a very simple way. Subtree constraints have tree-valued variables, for which we 
use lower case letters x, y, z. A subtree constraint $x \gg y$ says that the 
denotation of y is a subtree of the denotation of x. 

Although they look very similar, there is an important difference between dom- 
inance and subtree constraints: Dominance constraints can speak about occur- 
rences of subtrees by specifying their root nodes, whereas subtree constraints 
can’t. 

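An Example. Because of this difference, the naive encoding of dominance as 
subtree constraints does not preserve satisfiability. As an example, we consider 
the dominance constraint in (1) and the “corresponding” subtree constraint (2). 

(1) $X{:}f(X_1, X_2) \wedge X_1 \triangleleft^* Y \wedge X_2 \triangleleft^* Y$ 

(2) $x = f(x_1, x_2) \wedge x_1 \gg y \wedge x_2 \gg y$ 

[Graph of constraint (1): a node X labeled f with two children X_1 and X_2, both 
of which dominate the node Y.] 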
The dominance constraint (1) is depicted by the graph to the right. It describes 
trees in which the node denoted by X is labeled with a binary function symbol f 
and has two (distinct) children denoted by $X_1$ and $X_2$. Furthermore, it requires 
that there is a node, denoted by Y, which is below $X_1$ and $X_2$. This is impossible 
in a tree. Thus, (1) is unsatisfiable. 

The subtree constraint (2) requires that x, $x_1$, $x_2$, and y denote trees. The tree 
for x has two direct subtrees denoted by $x_1$ and $x_2$, which in turn have a common 
subtree y (not necessarily at the same position). The subtree constraint (2) is 
satisfiable; one solution is obtained by mapping y, $x_1$, and $x_2$ to the tree a, and 
x to the tree f(a, a). The two occurrences of y in the subtree constraint (2) refer 
to different occurrences of the tree a in f(a, a). 
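
The contrast between the two constraints can be replayed on the tree f(a, a). The following is only an illustrative sketch: the encoding of trees as nested tuples, the encoding of nodes as paths, and the helper names `subtrees`, `nodes`, and `dominates` are assumptions made for this example and are not notation from the paper.

```python
# Trees are nested tuples (label, child_1, ..., child_n); nodes are paths,
# i.e. tuples of child positions counted from 1. Both encodings are assumptions.

def subtrees(tree):
    """Yield every subtree of `tree`, including `tree` itself."""
    yield tree
    for child in tree[1:]:
        yield from subtrees(child)

def nodes(tree, path=()):
    """Yield the path of every node of `tree`."""
    yield path
    for k, child in enumerate(tree[1:], start=1):
        yield from nodes(child, path + (k,))

def dominates(p, q):
    """Node p dominates node q iff p is a (not necessarily proper) prefix of q."""
    return q[:len(p)] == p

a = ("a",)
tree = ("f", a, a)

# Subtree constraint (2): x = f(x1, x2), x1 >> y, x2 >> y with x1 = x2 = y = a.
x1 = x2 = y = a
subtree_solution = (tree == ("f", x1, x2)
                    and y in subtrees(x1) and y in subtrees(x2))

# Dominance constraint (1): X is the root, X1 and X2 are its two children.
# No node Y of this tree lies below both children.
X1, X2 = (1,), (2,)
common_descendants = [Y for Y in nodes(tree)
                      if dominates(X1, Y) and dominates(X2, Y)]

print(subtree_solution)      # True
print(common_descendants)    # []
```

The empty list is no accident of this particular tree: the paths of two distinct children differ in their last position, so no path can have both of them as a prefix.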



3 Trees and Contexts 



Understanding the notions of trees and contexts is essential for this paper. We 
next define both notions and explain the views on them we will adopt. 

We assume a signature $\Sigma$ of function symbols ranged over by f, g, each of which 
is equipped with a fixed arity $\mathrm{ar}(f) \geq 0$. Constants, ranged over by a, b, are 
function symbols with arity 0. We assume that $\Sigma$ contains at least two function 
symbols, one of which is not constant. Note that we do not restrict our signature 
to be finite. 



Trees. A (finite constructor) tree $\tau$ is a ground term constructed from function 
symbols in $\Sigma$. For instance, f(f(a, b), c) is a tree whose root node is labeled with 
f and which has three leaves labeled by a, b, c. 

An equivalent definition of trees, which makes the nodes and node labels of the 
tree explicit, is based on tree domains. Let $\mathbb{N}$ be the set of natural numbers 
$n \geq 1$ and $\mathbb{N}^*$ the set of words over natural numbers. $\varepsilon$ is the empty word, and 
the concatenation of two words $\pi$ and $\pi'$ is written by juxtaposition $\pi\pi'$. A path 
$\pi'$ is a prefix of $\pi$ if there is a $\pi''$ such that $\pi'\pi'' = \pi$. 

A tree domain D is a nonempty prefix-closed subset of $\mathbb{N}^*$. That is, D contains 
paths which are words of positive integers; they can intuitively be identified with 
the nodes of the tree. A labeling function is a function $L : D \to \Sigma$ defined 
on a tree domain D which satisfies for all $\pi \in D$ and $k \in \mathbb{N}$: $\pi k \in D$ iff 
$1 \leq k \leq \mathrm{ar}(L(\pi))$. A tree, then, is a pair (D, L) of a tree domain and a labeling 
function. 

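The two definitions of trees can be connected by associating with each tree $\tau$ a 
tree domain $D_\tau$ and a labeling function $L_\tau : D_\tau \to \Sigma$ as follows: 

$$D_{f(\tau_1, \ldots, \tau_n)} = \{\varepsilon\} \cup \{k\pi \mid 1 \leq k \leq n,\ \pi \in D_{\tau_k}\}$$

$$L_{f(\tau_1, \ldots, \tau_n)}(\pi) = \begin{cases} f & \text{if } \pi = \varepsilon \\ L_{\tau_k}(\pi') & \text{if } \pi = k\pi',\ 1 \leq k \leq n,\ \pi' \in D_{\tau_k} \end{cases}$$

For instance, the tree $\tau = f(g(a), b)$ has the tree domain $D_\tau = \{\varepsilon, 1, 11, 2\}$ and 
the labeling function $L_\tau$ with $L_\tau(\varepsilon) = f$, $L_\tau(1) = g$, $L_\tau(11) = a$, and $L_\tau(2) = b$. 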
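
The passage from a ground term to its tree domain and labeling function can be sketched in Python as follows. The nested-tuple encoding of terms, the use of integer tuples for paths, and the function name `tree_domain` are assumptions of this sketch; the recursion mirrors the two defining equations above.

```python
def tree_domain(tree, prefix=()):
    """Return the labeling function of `tree` as a dict {path: label}.

    The keys of the dict form the tree domain. A tree is a nested tuple
    (label, child_1, ..., child_n); a path is a tuple of positive integers.
    """
    labeling = {prefix: tree[0]}                      # the root contributes epsilon
    for k, child in enumerate(tree[1:], start=1):     # the k-th child contributes k.pi
        labeling.update(tree_domain(child, prefix + (k,)))
    return labeling

tau = ("f", ("g", ("a",)), ("b",))                    # the tree f(g(a), b)
L = tree_domain(tau)

print(sorted(L))                                      # [(), (1,), (1, 1), (2,)]
print(L[()], L[(1,)], L[(1, 1)], L[(2,)])             # f g a b
```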
Fig. 1. A context $\gamma$ with hole $\pi_0$. 



Lemma 1. For every finite tree domain D and labeling function $L : D \to \Sigma$ 
there exists a unique tree $\tau$ such that $D_\tau = D$ and $L_\tau = L$. 

Whenever $\tau$ is a tree and $\pi$ a path in $D_\tau$, we define the subtree $\tau.\pi$ of $\tau$ at $\pi$ as 
the unique tree with the following properties (otherwise $\tau.\pi$ is undefined): 

$$D_{\tau.\pi} = \{\pi' \mid \pi\pi' \in D_\tau\}$$

$$L_{\tau.\pi}(\pi') = L_\tau(\pi\pi') \quad \text{for all } \pi\pi' \in D_\tau$$



Lemma 2. For all trees $\tau$ and paths $\pi \in D_\tau$, if $f = L_\tau(\pi)$ and $\mathrm{ar}(f) = n$ then 
$\tau.\pi = f(\tau.(\pi 1), \ldots, \tau.(\pi n))$. 
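
Subtree selection is easy to make concrete. The sketch below assumes the same nested-tuple encoding of trees as an illustration (it is not the paper's notation), returns `None` where the paper says "undefined", and checks one instance of Lemma 2 at the root.

```python
def subtree(tree, path):
    """Return the subtree of `tree` at `path`, or None if `path` is not a node."""
    for k in path:
        if not 1 <= k < len(tree):      # a node with n children has len(tree) == n + 1
            return None
        tree = tree[k]
    return tree

tau = ("f", ("g", ("a",)), ("b",))      # the tree f(g(a), b)

print(subtree(tau, (1,)))               # ('g', ('a',))
print(subtree(tau, (1, 1)))             # ('a',)
print(subtree(tau, (3,)))               # None

# Lemma 2 at the root: tau.epsilon = f(tau.1, tau.2)
label, arity = tau[0], len(tau) - 1
rebuilt = (label,) + tuple(subtree(tau, (k,)) for k in range(1, arity + 1))
print(rebuilt == tau)                   # True
```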

Contexts. Intuitively, a context is a tree with a hole. More formally, we introduce 
a special symbol $\bullet$ that we call hole marker and assign it the arity $\mathrm{ar}(\bullet) = 0$. A 
context $\gamma$ is a ground term over $\Sigma \cup \{\bullet\}$ which contains exactly one occurrence 
of the hole marker. For instance, $f(a, f(\bullet, b))$ is a context, but $f(\bullet, f(\bullet, b))$ isn’t. 
We shall use the letter $\tau$ for trees over $\Sigma$ and the letter $\gamma$ for contexts (i.e. special 
trees over $\Sigma \cup \{\bullet\}$). 

The hole of a context $\gamma$ is the occurrence of the hole marker in $\gamma$. More precisely, 
the hole is the unique path $\pi_0 \in D_\gamma$ such that $L_\gamma(\pi_0) = \bullet$. Fig. 1 shows a context 
with hole $\pi_0$. 

We will freely consider contexts as functions that map trees to trees. Application 
$\gamma[\tau]$ of a context $\gamma$ to a tree $\tau$ is defined by 

$$\gamma[\tau] = \gamma[\tau/\bullet]$$

That is, $\gamma[\tau]$ is the result of substituting the hole marker $\bullet$ in $\gamma$ by $\tau$. The 
context $\bullet$ corresponds to the identity function on trees. This illustrates that the 
hole marker can be seen as a $\lambda$-bound variable (rather than a constant or a free 
variable). Concatenation $\gamma \circ \gamma'$ of contexts seen as functions can be defined as 
$\gamma[\gamma']$. 

Lemma 3. For a context $\gamma$ with hole $\pi$ and all trees $\tau$, it holds that $\gamma[\tau].\pi = \tau$. 
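
A minimal sketch of contexts as functions, under the assumption that the hole marker is encoded as the constant `("•",)` inside nested tuples; the names `holes`, `is_context`, `apply`, and `compose` are chosen here for illustration only.

```python
HOLE = ("•",)   # the hole marker, a constant of arity 0 (encoding assumed here)

def holes(term, path=()):
    """Return the paths of all occurrences of the hole marker in `term`."""
    if term == HOLE:
        return [path]
    found = []
    for k, child in enumerate(term[1:], start=1):
        found += holes(child, path + (k,))
    return found

def is_context(term):
    """A context contains exactly one occurrence of the hole marker."""
    return len(holes(term)) == 1

def apply(context, tree):
    """gamma[tau]: substitute the hole marker in `context` by `tree`."""
    if context == HOLE:
        return tree
    return (context[0],) + tuple(apply(c, tree) for c in context[1:])

def compose(outer, inner):
    """Concatenation of two contexts, defined as outer[inner]."""
    return apply(outer, inner)

def subtree(tree, path):
    for k in path:
        tree = tree[k]
    return tree

a, b = ("a",), ("b",)
gamma = ("f", a, ("f", HOLE, b))                        # f(a, f(•, b))

print(is_context(gamma))                                # True
print(is_context(("f", HOLE, ("f", HOLE, b))))          # False
print(apply(gamma, a))                                  # ('f', ('a',), ('f', ('a',), ('b',)))
print(apply(HOLE, b) == b)                              # True: • is the identity
print(subtree(apply(gamma, b), holes(gamma)[0]) == b)   # True: an instance of Lemma 3
```

Because concatenation is defined through substitution, applying a concatenated context agrees with applying the two contexts one after the other, e.g. `apply(compose(gamma, gamma), a) == apply(gamma, apply(gamma, a))`.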
Contexts in Trees. Since contexts are ground terms over a special signature, we 
have already defined subtree selection for contexts. If $\gamma$ is a context and $\pi \in D_\gamma$ 
then $\gamma.\pi$ is either a tree over $\Sigma$ or a context. It is a context iff $\pi$ is a prefix 
(proper or not) of the hole of $\gamma$. 

Given a tree $\tau$ and a path $\pi \in D_\tau$, we write the context obtained by replacing 
the subtree of $\tau$ at $\pi$ with the symbol $\bullet$ as $\tau{\bullet}\pi$. More precisely, $\tau{\bullet}\pi$ is defined 
as the context with domain $D_{\tau\bullet\pi} = \{\pi' \in D_\tau \mid \pi \text{ is not a proper prefix of } \pi'\}$ and the 
labeling function which assigns $L_{\tau\bullet\pi}(\pi') = L_\tau(\pi')$ for all $\pi' \in D_{\tau\bullet\pi} \setminus \{\pi\}$ and 
$L_{\tau\bullet\pi}(\pi) = \bullet$. 

Lemma 4. For all $\tau$ and $\pi \in D_\tau$ it holds that $\tau{\bullet}\pi[\tau.\pi] = \tau$. 

Given a prefix $\pi_1$ of $\pi_2$ and a tree $\tau$ with $\pi_2 \in D_\tau$, we define $\tau^{\pi_1}_{\pi_2}$ to be the 
context of $\tau$ between $\pi_1$ and $\pi_2$: 

$$\tau^{\pi_1}_{\pi_2} = (\tau.\pi_1){\bullet}\pi \quad \text{where } \pi_1\pi = \pi_2.$$
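
Both constructions can be sketched in a few lines. The function names `punch` (for replacing a subtree by the hole marker) and `between` (for the context between two nodes) are hypothetical names introduced for this example, and the nested-tuple encoding is an assumption as before.

```python
HOLE = ("•",)

def subtree(tree, path):
    for k in path:
        tree = tree[k]
    return tree

def apply(context, tree):
    if context == HOLE:
        return tree
    return (context[0],) + tuple(apply(c, tree) for c in context[1:])

def punch(tree, path):
    """Replace the subtree of `tree` at `path` by the hole marker."""
    if not path:
        return HOLE
    k = path[0]
    return tree[:k] + (punch(tree[k], path[1:]),) + tree[k + 1:]

def between(tree, p1, p2):
    """Context of `tree` between the nodes p1 and p2; p1 must be a prefix of p2."""
    assert p2[:len(p1)] == p1, "p1 must be a prefix of p2"
    return punch(subtree(tree, p1), p2[len(p1):])

a, b = ("a",), ("b",)
tau = ("g", ("f", ("f", b, a), a))              # g(f(f(b, a), a))

print(punch(tau, (1, 1)))                       # ('g', ('f', ('•',), ('a',)))
print(between(tau, (1,), (1, 1, 2)))            # ('f', ('f', ('b',), ('•',)), ('a',))

# Lemma 4: plugging the removed subtree back into the hole restores the tree.
print(apply(punch(tau, (1, 1)), subtree(tau, (1, 1))) == tau)   # True
```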

4 Dominance and Parallelism Constraints 

We now present the language of dominance and parallelism constraints which is 
a fragment of the constraint language over λ-structures CLLS [...]. CLLS also 
has constructs for dealing with variable binding and anaphora, but we ignore 
these for the purpose of this paper. 

Our notion of dominance constraints differs slightly from the one used e.g. by 
Vijay-Shanker [...]; these languages are mostly based on feature trees as common 
in computational linguistics, whereas our trees are constructor trees. 



4.1 Tree Structures 

We first define tree structures, logical structures representing trees. Tree struc- 
tures fix the interpretation of a set of predicate symbols. Based on tree structures, 
we will define the syntax and semantics of our constraint language in the usual 
Tarskian style. 

We associate with every tree $\tau$ a logical structure $\mathcal{M}^\tau$, the tree structure of 
$\tau$. The domain of the tree structure $\mathcal{M}^\tau$ coincides with the tree domain of $\tau$. 
Furthermore, $\mathcal{M}^\tau$ provides interpretations for the binary relation symbol $\triangleleft^*$, a 
4-ary relation symbol $\cdot/\cdot \sim \cdot/\cdot$, and a relation symbol $:f$ of arity $\mathrm{ar}(f) + 1$ for every 
function symbol $f \in \Sigma$. We use the same symbols for relations and relation 
symbols; there shouldn’t be any danger of confusion. For instance, we write 
$\pi \triangleleft^* \pi'$ in order to say that the relation $\triangleleft^*$ holds for the pair $(\pi, \pi')$, whereas 
$X \triangleleft^* X'$ is an atomic constraint built from the relation symbol $\triangleleft^*$ and variables 
X, X'. A relation symbol is generally interpreted by the relation of the same 
name. 

Fig. 2. The dominance constraint $X{:}g(Y) \wedge Y \triangleleft^* Z$ and one of its solutions. 

If $f \in \Sigma$ and $\mathrm{ar}(f) = n$, then the labeling relation $\pi{:}f(\pi_1, \ldots, \pi_n)$ is true in $\mathcal{M}^\tau$ 
iff $L_\tau(\pi) = f$ and $\pi_i = \pi i$ for all $1 \leq i \leq n$. The dominance relation $\pi \triangleleft^* \pi'$ is 
true in $\mathcal{M}^\tau$ iff $\pi, \pi' \in D_\tau$ and $\pi$ is a prefix of $\pi'$. Finally, the parallelism relation 
holds if the contexts $\tau^{\pi_1}_{\pi_1'}$ and $\tau^{\pi_2}_{\pi_2'}$ exist and coincide: 

$$\pi_1/\pi_1' \sim \pi_2/\pi_2' \text{ holds in } \mathcal{M}^\tau \text{ iff } \pi_1 \triangleleft^* \pi_1',\ \pi_2 \triangleleft^* \pi_2', \text{ and } \tau^{\pi_1}_{\pi_1'} = \tau^{\pi_2}_{\pi_2'}.$$

Intuitively, this means the subtrees of $\tau$ below $\pi_1$ and $\pi_2$ have the same structure, 
except for the subtrees below $\pi_1'$ and $\pi_2'$, which may be different. 
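
The three relations of a tree structure can be checked directly on a concrete tree. The sketch below is an illustration under assumed encodings (nested tuples for trees, integer tuples for nodes); the function names `dominates`, `labeled`, and `parallel` are not from the paper.

```python
HOLE = ("•",)

def subtree(tree, path):
    """Subtree at `path`, or None if `path` is not a node of `tree`."""
    for k in path:
        if not 1 <= k < len(tree):
            return None
        tree = tree[k]
    return tree

def punch(tree, path):
    """Replace the subtree at `path` by the hole marker."""
    if not path:
        return HOLE
    k = path[0]
    return tree[:k] + (punch(tree[k], path[1:]),) + tree[k + 1:]

def dominates(tree, p, q):
    """Dominance: p and q are nodes of `tree` and p is a prefix of q."""
    return (subtree(tree, p) is not None and subtree(tree, q) is not None
            and q[:len(p)] == p)

def labeled(tree, p, f, children):
    """Labeling: node p carries the label f and its i-th child is children[i-1]."""
    node = subtree(tree, p)
    return (node is not None and node[0] == f
            and len(node) - 1 == len(children)
            and all(c == p + (i,) for i, c in enumerate(children, start=1)))

def parallel(tree, p1, q1, p2, q2):
    """Parallelism p1/q1 ~ p2/q2: the two contexts exist and coincide."""
    if not (dominates(tree, p1, q1) and dominates(tree, p2, q2)):
        return False
    return (punch(subtree(tree, p1), q1[len(p1):])
            == punch(subtree(tree, p2), q2[len(p2):]))

a, b, c = ("a",), ("b",), ("c",)
tau = ("h", ("f", a, b), ("f", a, c))                 # h(f(a, b), f(a, c))

print(dominates(tau, (1,), (1, 2)))                   # True
print(labeled(tau, (), "h", [(1,), (2,)]))            # True
print(parallel(tau, (1,), (1, 2), (2,), (2, 2)))      # True:  f(a, •) on both sides
print(parallel(tau, (1,), (1, 1), (2,), (2, 2)))      # False: f(•, b) versus f(a, •)
```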

4.2 The Constraint Language 

We assume an infinite set of node variables X, Y, Z. A parallelism constraint $\varphi$ 
is given by the following abstract syntax: 

$$\varphi ::= X \triangleleft^* Y \mid X{:}f(X_1, \ldots, X_n) \mid X/X' \sim Y/Y' \mid \varphi \wedge \varphi'$$

A parallelism constraint is a conjunction of atomic constraints for the 
dominance, labeling, and parallelism relations. A dominance constraint is a parallelism 
constraint without atomic constraints $X/X' \sim Y/Y'$ for parallelism. 

The semantics of parallelism constraints is given by interpretation over arbitrary 
tree structures $\mathcal{M}^\tau$. A solution of a parallelism constraint $\varphi$ consists of a tree 
structure $\mathcal{M}^\tau$ and a variable assignment $\alpha$ into its domain that satisfies all 
atomic constraints in $\varphi$. We write $(\mathcal{M}^\tau, \alpha) \models \varphi$ if $(\mathcal{M}^\tau, \alpha)$ is a solution of $\varphi$. 

Note that the constraint $X{:}a \wedge Y{:}a$ has solutions where X and Y denote distinct 
nodes both of which are labeled with a. 

We often display a dominance constraint and its solutions graphically. For 
instance, the constraint $X{:}g(Y) \wedge Y \triangleleft^* Z$ and one of its solutions are displayed in 
Fig. 2. Note that additional material (printed in light gray) has been filled into 
the space between the nodes denoted by Y and Z and above X. The dominance 
constraint does not say anything about these regions. 

The careful reader might have noticed that dominance constraints can be 
expressed by parallelism constraints without atomic constraints for dominance. 



Lemma 5. The equivalence $X \triangleleft^* Y \leftrightarrow X/Y \sim X/Y$ is valid in all tree structures. 
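$$\begin{array}{l}
X_1{:}\forall u(X_3) \wedge X_3{:}{\to}(X_4, X_5) \wedge {} \\
X_4{:}\mathit{man}(X_6) \wedge X_6{:}\mathit{var}_u \wedge {} \\
X_2{:}\exists v(X_7) \wedge X_7{:}{\wedge}(X_8, X_9) \wedge {} \\
X_8{:}\mathit{woman}(X_{10}) \wedge X_{10}{:}\mathit{var}_v \wedge {} \\
X_5 \triangleleft^* X_{11} \wedge X_9 \triangleleft^* X_{11} \wedge {} \\
X_{11}{:}\mathit{love}(X_{12}, X_{13}) \wedge {} \\
X_{12}{:}\mathit{var}_u \wedge X_{13}{:}\mathit{var}_v
\end{array}$$

Fig. 3. An underspecified representation of the meaning of Example (3). 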



4.3 Application to Semantic Underspecification 

As examples for the linguistic application of dominance and parallelism 
constraints, we briefly review a scope ambiguity and a very simple VP ellipsis. For 
the first example, consider the sentence (3), which is a classical scope ambiguity. 

(3) Every man likes a woman. 

The readings of this sentence can be represented by the predicate logic formulae 
in (4) and (5). 

(4) $\forall u.(\mathit{man}(u) \to \exists v.(\mathit{woman}(v) \wedge \mathit{love}(u, v)))$ 

(5) $\exists v.(\mathit{woman}(v) \wedge \forall u.(\mathit{man}(u) \to \mathit{love}(u, v)))$ 

A compact underspecified representation of both readings is given by the 
dominance constraint in Fig. 3. The semantic representation of the sentence is 
considered as a tree, which is then described by a dominance constraint. 

Ellipses can be modeled with parallelism constraints expressing that the trees 
corresponding to the semantics of source and target sentences must be the same 
except for the respective parallel elements. For instance, the semantics of (6) can 
be described by (7). 

(6) John sleeps. Mary does too. 

(7) $X{:}\mathit{sleep}(X') \wedge X'{:}\mathit{john} \wedge Y'{:}\mathit{mary} \wedge X/X' \sim Y/Y'$ 

We cannot go into this in more detail here and refer the reader to [...] for an 
in-depth discussion (in particular on the interaction of scope and ellipses). 
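
That the constraint of Fig. 3 admits both readings, and that (7) is satisfied by a tree for (6), can be checked mechanically. The following sketch makes several assumptions that are not in the paper: constraints are lists of tagged tuples, the label names (`forall_u`, `implies`, `exists_v`, `and`) are ad hoc spellings of the logical symbols, and the two sentences of (6) are joined under a binary `and` node.

```python
HOLE = ("•",)

def subtree(tree, path):
    """Subtree at `path`, or None if `path` is not a node of `tree`."""
    for k in path:
        if not 1 <= k < len(tree):
            return None
        tree = tree[k]
    return tree

def punch(tree, path):
    """Replace the subtree at `path` by the hole marker."""
    if not path:
        return HOLE
    k = path[0]
    return tree[:k] + (punch(tree[k], path[1:]),) + tree[k + 1:]

def is_prefix(p, q):
    return q[:len(p)] == p

def holds(tree, alpha, atom):
    """Truth of one atomic constraint under the node assignment `alpha`."""
    if atom[0] == "dom":                                   # X dominates Y
        p, q = alpha[atom[1]], alpha[atom[2]]
        return subtree(tree, q) is not None and is_prefix(p, q)
    if atom[0] == "lab":                                   # X:f(X1, ..., Xn)
        p = alpha[atom[1]]
        node, kids = subtree(tree, p), [alpha[v] for v in atom[3]]
        return (node is not None and node[0] == atom[2]
                and len(node) - 1 == len(kids)
                and all(c == p + (i,) for i, c in enumerate(kids, start=1)))
    p1, q1, p2, q2 = (alpha[v] for v in atom[1:])          # X/X' ~ Y/Y'
    return (subtree(tree, q1) is not None and subtree(tree, q2) is not None
            and is_prefix(p1, q1) and is_prefix(p2, q2)
            and punch(subtree(tree, p1), q1[len(p1):])
            == punch(subtree(tree, p2), q2[len(p2):]))

def satisfies(tree, alpha, constraint):
    return all(holds(tree, alpha, atom) for atom in constraint)

def nodes(paths):
    """Turn {"X1": "121", ...} into {"X1": (1, 2, 1), ...}."""
    return {name: tuple(int(d) for d in digits) for name, digits in paths.items()}

fig3 = [
    ("lab", "X1", "forall_u", ["X3"]), ("lab", "X3", "implies", ["X4", "X5"]),
    ("lab", "X4", "man", ["X6"]), ("lab", "X6", "var_u", []),
    ("lab", "X2", "exists_v", ["X7"]), ("lab", "X7", "and", ["X8", "X9"]),
    ("lab", "X8", "woman", ["X10"]), ("lab", "X10", "var_v", []),
    ("dom", "X5", "X11"), ("dom", "X9", "X11"),
    ("lab", "X11", "love", ["X12", "X13"]),
    ("lab", "X12", "var_u", []), ("lab", "X13", "var_v", []),
]
u, v = ("var_u",), ("var_v",)
love = ("love", u, v)
reading4 = ("forall_u", ("implies", ("man", u),
                         ("exists_v", ("and", ("woman", v), love))))
reading5 = ("exists_v", ("and", ("woman", v),
                         ("forall_u", ("implies", ("man", u), love))))
alpha4 = nodes({"X1": "", "X3": "1", "X4": "11", "X6": "111", "X5": "12",
                "X2": "12", "X7": "121", "X8": "1211", "X10": "12111",
                "X9": "1212", "X11": "1212", "X12": "12121", "X13": "12122"})
alpha5 = nodes({"X2": "", "X7": "1", "X8": "11", "X10": "111", "X9": "12",
                "X1": "12", "X3": "121", "X4": "1211", "X6": "12111",
                "X5": "1212", "X11": "1212", "X12": "12121", "X13": "12122"})

ellipsis = [("lab", "X", "sleep", ["X'"]), ("lab", "X'", "john", []),
            ("lab", "Y'", "mary", []), ("par", "X", "X'", "Y", "Y'")]
discourse = ("and", ("sleep", ("john",)), ("sleep", ("mary",)))
alpha7 = nodes({"X": "1", "X'": "11", "Y": "2", "Y'": "21"})

print(satisfies(reading4, alpha4, fig3))        # True
print(satisfies(reading5, alpha5, fig3))        # True
print(satisfies(reading4, alpha5, fig3))        # False
print(satisfies(discourse, alpha7, ellipsis))   # True
```

In both readings the dominance atoms leave open which quantifier sits above the other; the assignments `alpha4` and `alpha5` differ exactly in that choice.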

5 Context Unification 

Context unification is the problem of solving equations between tree-valued 
terms in the two-sorted algebra TC of trees and contexts. We first introduce 
equations between tree-valued terms and then show that they can also express 
equations between context-valued terms. Finally, we sketch an application to 
semantic underspecification. 
$$x = g(y) \wedge y = C(z) \qquad\qquad \begin{array}{l} x \mapsto g(f(f(b, a), a)), \\ y \mapsto f(f(b, a), a), \\ C \mapsto f(f(b, \bullet), a), \\ z \mapsto a \end{array}$$



Fig. 4. The equation system $x = g(y) \wedge y = C(z)$ and one of its solutions. 



5.1 Syntax and Semantics of CU 

The algebra of trees and contexts TC over $\Sigma$ is a two-sorted algebra whose 
domains are the set of trees and the set of contexts over $\Sigma$. The operations 
provided by TC are tree construction and functional application of contexts to 
trees. Each function symbol $f \in \Sigma$ is interpreted as an $\mathrm{ar}(f)$-ary tree constructor, 
which maps a tuple $(\tau_1, \ldots, \tau_n)$ of trees to the tree $f(\tau_1, \ldots, \tau_n)$. The application 
$\gamma[\tau]$ of a context $\gamma$ to a tree $\tau$ has already been defined. 

For both sorts of TC, we assume an infinite set of variables: tree variables x, y, z 
and context variables C. A tree-valued term t is built from tree variables, 
applications of function symbols in $\Sigma$, and application of context variables. 

$$t ::= x \mid f(t_1, \ldots, t_n) \mid C(t) \qquad (\mathrm{ar}(f) = n)$$

In particular, every tree is a tree-valued term. 

A variable assignment into TC is a function $\beta$ that assigns trees to tree variables 
and contexts to context variables. Variable assignments can be lifted 
homomorphically to tree-valued terms: 

$$\beta(f(t_1, \ldots, t_n)) = f(\beta(t_1), \ldots, \beta(t_n))$$

$$\beta(C(t)) = \beta(C)[\beta(t)]$$

A variable assignment $\beta$ into TC is a solution of an equation system (i.e. a 
conjunction of equations between terms) if $\beta(t) = \beta(t')$ holds for all equations 
$t = t'$ in this system. Context unification is the problem of solving such equation 
systems over TC. An example for a solution of the equation system $x = g(y) \wedge 
y = C(z)$ is given in Fig. 4. The similarity between Figures 2 and 4 is intended. 
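
Checking whether a given assignment solves an equation system only requires the homomorphic lifting just described. The sketch below assumes an ad hoc term encoding (strings for tree variables, a tuple tagged `"ctx"` for the application of a context variable, and plain tuples for function symbols); it verifies the solution of Fig. 4 and does not attempt to search for solutions.

```python
HOLE = ("•",)

def apply(context, tree):
    """gamma[tau]: substitute the hole marker in `context` by `tree`."""
    if context == HOLE:
        return tree
    return (context[0],) + tuple(apply(c, tree) for c in context[1:])

def evaluate(term, beta):
    """Lift the assignment `beta` to a tree-valued term.

    Assumed encoding: a string is a tree variable, ("ctx", C, t) is the
    application C(t) of the context variable C, and any other tuple
    (f, t1, ..., tn) is a function symbol applied to terms. The tag "ctx"
    must therefore not be used as a function symbol.
    """
    if isinstance(term, str):
        return beta[term]
    if term[0] == "ctx":
        return apply(beta[term[1]], evaluate(term[2], beta))
    return (term[0],) + tuple(evaluate(t, beta) for t in term[1:])

def solves(beta, equations):
    """True iff both sides of every equation denote the same tree."""
    return all(evaluate(lhs, beta) == evaluate(rhs, beta)
               for lhs, rhs in equations)

a, b = ("a",), ("b",)
system = [("x", ("g", "y")), ("y", ("ctx", "C", "z"))]   # x = g(y), y = C(z)
beta = {
    "x": ("g", ("f", ("f", b, a), a)),
    "y": ("f", ("f", b, a), a),
    "C": ("f", ("f", b, HOLE), a),
    "z": a,
}

print(solves(beta, system))                  # True
print(solves({**beta, "z": b}, system))      # False
```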

5.2 Properties of Contexts 

The following three lemmas are quite simple, but will facilitate a lot of later 
work. 

Lemma 6. Two contexts $\gamma$ and $\gamma'$ are equal iff their holes are the same and 
there is a tree $\tau$ such that $\gamma[\tau] = \gamma'[\tau]$. 

Note that the existence of a single tree $\tau$ such that $\gamma[\tau] = \gamma'[\tau]$ is sufficient. 

Proof. The “⇒” direction is trivial. For the other direction, all we have to prove 
is that the domains of the contexts $\gamma$ and $\gamma'$ are equal; this immediately implies 
equality of the labeling functions, since for every $\pi \in D_\gamma$ except for the (common) 
hole, $L_\gamma(\pi) = L_{\gamma[\tau]}(\pi) = L_{\gamma'[\tau]}(\pi) = L_{\gamma'}(\pi)$. 

Let’s say that the common hole of $\gamma$ and $\gamma'$ is $\pi_0$. Then $\gamma[\tau] = \gamma'[\tau]$ implies the 
following equalities, where $\pi_0 D_\tau = \{\pi_0\pi \mid \pi \in D_\tau\}$: 

$$D_\gamma \cup \pi_0 D_\tau = D_{\gamma[\tau]} = D_{\gamma'[\tau]} = D_{\gamma'} \cup \pi_0 D_\tau.$$

As the unions on both sides are between sets whose respective intersection is 
$\{\pi_0\}$, it follows that $D_\gamma = D_{\gamma'}$. □ 

We next express a correspondence between nodes and their contexts. 

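Lemma 7. Let $\tau$ be a tree and $\pi_1$ a prefix of $\pi_2$ with $\pi_2 \in D_\tau$. Then $\tau^{\pi_1}_{\pi_2}$ is 
the unique context with hole $\pi$, where $\pi_1\pi = \pi_2$, such that $\tau.\pi_1 = \tau^{\pi_1}_{\pi_2}[\tau.\pi_2]$. 

Proof. From Lemma 4 it follows that $\tau.\pi_1 = \tau^{\pi_1}_{\pi_2}[\tau.\pi_2]$. The uniqueness of 
$\tau^{\pi_1}_{\pi_2}$ follows from Lemma 6. □ 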



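Lemma 8. Let $\pi_1$ be a prefix of $\pi_2$, $\pi_2$ a prefix of $\pi_3$, and $\tau$ a tree whose domain 
contains $\pi_1$, $\pi_2$, and $\pi_3$. Then $\tau^{\pi_1}_{\pi_2} \circ \tau^{\pi_2}_{\pi_3} = \tau^{\pi_1}_{\pi_3}$. 

Proof. Straightforward. □ 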

5.3 Equations between Context-Valued Terms 

In the construction in the next section, it will be convenient to use equations 
between context-valued terms, such as $C = C_1 \circ C_2$. This notation emphasizes the 
functional character of contexts. In this section, we show that these equations can 
in fact be expressed by equations between tree-valued terms. A context-valued 
term u has the following abstract syntax: 

$$u ::= C \mid \bullet \mid f(t_1, \ldots, t_i, u, t_{i+1}, \ldots, t_n) \mid u \circ u'$$

We conservatively extend TC by concatenation $\gamma \circ \gamma'$ of contexts and lift variable 
assignments $\beta$ to context-valued terms as follows. As above, we define that $\beta$ is 
a solution of an equation $u = u'$ iff it maps u and u' to the same context. 

$$\beta(\bullet) = \bullet$$

$$\beta(f(t_1, \ldots, u, \ldots, t_n)) = f(\beta(t_1), \ldots, \beta(u), \ldots, \beta(t_n))$$

$$\beta(u \circ u') = \beta(u) \circ \beta(u')$$

Now we can define syntactic insertion $u[t]$ of tree-valued into context-valued 
terms in the obvious way. This produces tree-valued terms with the property 
$\beta(u[t]) = \beta(u)[\beta(t)]$. With this operation, we can express each equation between 
context-valued terms as a conjunction of equations between tree-valued terms. 
Proposition 1 (Equations between context-valued terms). Let $\tau_1$ and 
$\tau_2$ be two different trees. Then the following equivalence holds: 

$$u = u' \leftrightarrow u[\tau_1] = u'[\tau_1] \wedge u[\tau_2] = u'[\tau_2].$$

Note that our restriction on the signature in Section 3 implies that two different 
trees really exist. 

Proof. The direction from left to right is trivial. For the right-to-left direction, 
we show that the contexts $\gamma$ and $\gamma'$ denoted by u and u' must be equal. To this 
end, we only need to show that $\gamma$ and $\gamma'$ have the same hole; then their equality 
follows from Lemma 6. 

Let’s say that $\pi$ and $\pi'$ are the holes of $\gamma$ and $\gamma'$, respectively, and suppose for 
contradiction that $\pi \neq \pi'$. The path $\pi$ cannot be a proper prefix of $\pi'$ or vice 
versa. Otherwise, $\gamma[\tau_1] = \gamma'[\tau_1]$ would not be satisfied. Since $\pi' \in D_{\gamma[\tau_1]}$, either 
$\pi' \in D_\gamma$, or $\pi$ is a proper prefix of $\pi'$. But $\pi$ is no prefix (proper or not) of $\pi'$, 
so $\pi' \in D_\gamma$. As $\pi'$ and $\pi$ are not a prefix of each other, it follows that 
$\gamma[\tau_1].\pi' = \gamma[\tau_2].\pi'$. Hence, by Lemma 3, 

$$\gamma[\tau_1].\pi' = \gamma'[\tau_1].\pi' = \tau_1$$

$$\gamma[\tau_2].\pi' = \gamma'[\tau_2].\pi' = \tau_2.$$

So in contradiction to our assumptions, we have derived that $\tau_1 = \tau_2$. □ 
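
The role of the second test tree in Proposition 1 can be seen on a small example, and the proposition itself can be confirmed exhaustively for small contexts. The sketch below assumes the signature {f/2, a, b} and bounded depth purely for illustration; the helper names `equal_on`, `trees`, and `contexts` are hypothetical.

```python
HOLE = ("•",)

def apply(context, tree):
    if context == HOLE:
        return tree
    return (context[0],) + tuple(apply(c, tree) for c in context[1:])

def equal_on(ctx1, ctx2, test_trees):
    """True iff both contexts yield the same tree on every test tree."""
    return all(apply(ctx1, t) == apply(ctx2, t) for t in test_trees)

a, b = ("a",), ("b",)
left = ("f", HOLE, a)       # f(•, a)
right = ("f", a, HOLE)      # f(a, •)

print(left == right)                     # False
print(equal_on(left, right, [a]))        # True:  one test tree is not enough
print(equal_on(left, right, [a, b]))     # False: two different trees tell them apart

def trees(depth):
    """All trees over {f/2, a, b} up to the given depth."""
    result = [a, b]
    if depth > 0:
        smaller = trees(depth - 1)
        result += [("f", s, t) for s in smaller for t in smaller]
    return result

def contexts(depth):
    """All contexts over {f/2, a, b} whose hole is at depth <= `depth`."""
    result = [HOLE]
    if depth > 0:
        for c in contexts(depth - 1):
            for t in trees(depth - 1):
                result += [("f", c, t), ("f", t, c)]
    return result

small = contexts(2)
agrees = all((c1 == c2) == equal_on(c1, c2, [a, b])
             for c1 in small for c2 in small)
print(len(small), agrees)                # 61 True
```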

5.4 Application to Semantic Underspecification 

It is quite simple to express a scope ambiguity by using equations between 
context-valued terms. An underspecified representation of the meaning of 
Example (3) is given below. 

$$x_T = C_1(\mathit{love}(\mathit{var}_u, \mathit{var}_v))$$

$$C_1 = C_2(\forall u({\to}(\mathit{man}(\mathit{var}_u), C_3)))$$

$$C_1 = C_4(\exists v({\wedge}(\mathit{woman}(\mathit{var}_v), C_5)))$$

The semantics of the whole sentence is represented by the tree denoted by $x_T$ 
in solutions of the above equations. The first equation states that the semantic 
description contains a description of the semantics of the verb love. The context 
of the verb semantics is denoted by $C_1$. The second equation requires that a 
quantifier every man is placed within the context denoted by $C_1$, i.e. above the 
verb. The third equation states that another quantifier a woman also has to be 
placed above the verb. 
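
Two solutions of these equations, one per reading, can be written down and verified directly. In the sketch below the second and third equations are read as equations between contexts, with concatenation implemented by substitution; the label names and the helper `solves` are assumptions of this example.

```python
HOLE = ("•",)

def apply(context, term):
    """Substitute the hole marker; also serves as concatenation of contexts."""
    if context == HOLE:
        return term
    return (context[0],) + tuple(apply(c, term) for c in context[1:])

u, v = ("var_u",), ("var_v",)
love = ("love", u, v)
every_man = ("forall_u", ("implies", ("man", u), HOLE))   # quantifier with open scope
a_woman = ("exists_v", ("and", ("woman", v), HOLE))

def solves(x_T, C1, C2, C3, C4, C5):
    """Check the three equations for the given trees and contexts."""
    return (x_T == apply(C1, love)
            and C1 == apply(C2, apply(every_man, C3))
            and C1 == apply(C4, apply(a_woman, C5)))

# Reading (4): "every man" takes wide scope.
C1_wide_every = apply(every_man, a_woman)
reading4 = apply(C1_wide_every, love)
ok4 = solves(reading4, C1_wide_every,
             C2=HOLE, C3=a_woman, C4=every_man, C5=HOLE)

# Reading (5): "a woman" takes wide scope.
C1_wide_a = apply(a_woman, every_man)
reading5 = apply(C1_wide_a, love)
ok5 = solves(reading5, C1_wide_a,
             C2=a_woman, C3=HOLE, C4=HOLE, C5=every_man)

print(ok4, ok5)                 # True True
print(reading4 == reading5)     # False
```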

6 Parallelism Constraints into Context Unification 

In this section, we encode parallelism constraints (and thus dominance 
constraints) into context unification. More precisely, we show that for every 
parallelism constraint $\varphi$, there is an equation system $[\![\varphi]\!]$ in the language of context 
unification with the same solutions (up to a simple correspondence). We freely 
use equations between context-valued terms, which is safe according to 
Proposition 1. 

$$[\![X \triangleleft^* Y]\!]_p = \exists C\,(C_X \circ C = C_Y) \quad (C \text{ fresh})$$

$$[\![X{:}f(X_1, \ldots, X_n)]\!]_p = \bigwedge_{1 \leq i \leq n} C_{X_i} = C_X \circ f(x_1, \ldots, x_{i-1}, \bullet, x_{i+1}, \ldots, x_n) \quad (n \geq 1)$$

$$[\![X{:}a]\!]_p = x = a$$

$$[\![X/X' \sim Y/Y']\!]_p = \exists C\,(C_{X'} = C_X \circ C \wedge C_{Y'} = C_Y \circ C) \quad (C \text{ fresh})$$

$$[\![\varphi_1 \wedge \varphi_2]\!]_p = [\![\varphi_1]\!]_p \wedge [\![\varphi_2]\!]_p$$

Fig. 5. Pre-encoding of dominance and parallelism constraints. 

We will proceed as follows: First, we define the encoding and consider some 
examples. Second, we lift the encoding to the first-order theory of parallelism 
constraints and prove its correctness. 

For the proof, we will relate every solution $(\mathcal{M}^\tau, \alpha)$ of a parallelism constraint 
to a variable assignment $[\![\mathcal{M}^\tau, \alpha]\!]$ into TC which solves the encoded constraint. 
With this terminology, the key result (Proposition 3) of our correctness proof 
(which makes the term “have the same solutions” precise) can be stated like this: 
For an arbitrary dominance constraint $\varphi$ and its encoding $[\![\varphi]\!]$ as a CU equation 
system, the following equivalence holds. 

$$(\mathcal{M}^\tau, \alpha) \models \varphi \iff TC, [\![\mathcal{M}^\tau, \alpha]\!] \models [\![\varphi]\!]$$

As illustrated in Section 2, the main obstacle that we must overcome in our 
encoding of dominance constraints is to provide the power to talk about 
occurrences of subtrees. The central idea is to talk about nodes (occurrences of 
subtrees) by talking about their contexts. For instance, the two occurrences of 
a in the term f(a, a) can be specified by the contexts represented by $f(a, \bullet)$ and 
$f(\bullet, a)$ respectively. 



6.1 The Encoding 

Let us define the encoding of a parallelism constraint $\varphi$. We associate with every 
variable X appearing in $\varphi$ a context variable $C_X$ (whose purpose it is to denote 
the context starting at the root of the tree and whose hole is the node denoted 
by X) and a tree variable x (whose purpose it is to denote the tree below X). 
In addition, we introduce a new tree variable $x_T$ that we want to denote the 
entire tree. To ensure that these new variables interact correctly, we impose the 
following constraint, $\mathrm{Root}(\varphi)$, where $\mathcal{FV}(\varphi)$ are the free variables of $\varphi$: 

$$\mathrm{Root}(\varphi) = \bigwedge_{X \in \mathcal{FV}(\varphi)} x_T = C_X(x)$$

In addition, we define a pre-encoding $[\![\cdot]\!]_p$ as in Figure 5. The complete encoding 
$[\![\cdot]\!]$ is obtained as 

$$[\![\varphi]\!] = [\![\varphi]\!]_p \wedge \mathrm{Root}(\varphi).$$

An atomic dominance constraint $X \triangleleft^* X'$ is pre-encoded by $\exists C\,(C_{X'} = C_X \circ C)$, 
which expresses that the context of the node X can be enlarged by adding 
more material below its hole to obtain the context of X'. An atomic parallelism 
constraint $X/X' \sim Y/Y'$ is pre-encoded by $\exists C\,(C_X \circ C = C_{X'} \wedge C_Y \circ C = C_{Y'})$, 
which expresses that the context of the node X can be enlarged to the context 
of X' by adding the same material as for enlarging the context of Y to that of Y'. 
The pre-encoding of $X{:}f(X_1, \ldots, X_n)$ requires for all $1 \leq i \leq n$ that the context 
above $X_i$ is the context above X, enlarged with $f(x_1, \ldots, \bullet, \ldots, x_n)$, where the 
hole is at position i. For a nullary labeling constraint $X{:}a$, the pre-encoding 
requires $x = a$. 
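
The encoding is purely syntactic and can be sketched as a function from constraints to lists of equations. In the sketch below, constraints are lists of tagged tuples, equations are rendered as plain strings with `o` for concatenation, the tree variable for X is written as the lower-case name, and fresh context variables are named C1, C2, and so on; all of these conventions are assumptions of this example. The fresh variables are understood as existentially quantified.

```python
import itertools

def pre_encode(atom, fresh):
    """Pre-encoding of one atomic constraint, following Fig. 5."""
    kind = atom[0]
    if kind == "dom":                                   # X dominates Y
        _, X, Y = atom
        return [f"C_{X} o {next(fresh)} = C_{Y}"]
    if kind == "lab":                                   # X:f(X1, ..., Xn) or X:a
        _, X, label, children = atom
        if not children:
            return [f"{X.lower()} = {label}"]
        equations = []
        for i, Xi in enumerate(children):
            args = [child.lower() for child in children]
            args[i] = "•"                               # hole at position i + 1
            equations.append(f"C_{Xi} = C_{X} o {label}({', '.join(args)})")
        return equations
    _, X, X2, Y, Y2 = atom                              # "par": X/X2 ~ Y/Y2
    C = next(fresh)
    return [f"C_{X2} = C_{X} o {C}", f"C_{Y2} = C_{Y} o {C}"]

def variables(constraint):
    """Node variables of the constraint in order of first occurrence."""
    seen = []
    for atom in constraint:
        names = [atom[1]] + (atom[3] if atom[0] == "lab" else list(atom[2:]))
        for name in names:
            if name not in seen:
                seen.append(name)
    return seen

def encode(constraint):
    """Complete encoding: the Root formula followed by the pre-encoding."""
    fresh = (f"C{i}" for i in itertools.count(1))
    root = [f"x_T = C_{X}({X.lower()})" for X in variables(constraint)]
    pre = [eq for atom in constraint for eq in pre_encode(atom, fresh)]
    return root + pre

unsat = [("lab", "X", "a", []), ("lab", "Y", "b", []), ("dom", "X", "Y")]
for equation in encode(unsat):
    print(equation)
# x_T = C_X(x)
# x_T = C_Y(y)
# x = a
# y = b
# C_X o C1 = C_Y
```

The number of equations produced is linear in the size of the constraint, which is in line with the polynomial-time claim made in the introduction.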

Proposition 2 (Encoding Parallelism Constraints). A parallelism 
constraint $\varphi$ is satisfiable iff its encoding $\mathrm{Root}(\varphi) \wedge [\![\varphi]\!]_p$ is a satisfiable equation 
system of context unification. 

Proof. The proposition will be a simple consequence of Corollary 1, the analogous 
result for first-order formulae. □ 



6.2 Examples 

Before we turn to the first-order case, let us consider some examples. First, we 
reconsider Example (1) from Section 2. When we tried to encode this dominance 
constraint as a subtree constraint (2), we lost unsatisfiability. However, our new 
encoding works just fine. (8) shows the pre-encoding of the example; we have 
left the Root formula away, as it is not necessary for the unsatisfiability in this 
case. 

(1) $X{:}f(X_1, X_2) \wedge X_1 \triangleleft^* Y \wedge X_2 \triangleleft^* Y$ 

(8) $C_{X_1} = C_X \circ f(\bullet, x_2) \wedge C_{X_2} = C_X \circ f(x_1, \bullet) \wedge C_{X_1} \circ C = C_Y \wedge C_{X_2} \circ C' = C_Y$ 

We can see that (8) is unsatisfiable in the following way. As $C_{X_1} \circ C = C_Y$ and 
$C_{X_2} \circ C' = C_Y$, we have $C_{X_1} \circ C = C_{X_2} \circ C'$. In this equation, we can substitute $C_{X_1}$ by 
$C_X \circ f(\bullet, x_2)$ and $C_{X_2}$ by $C_X \circ f(x_1, \bullet)$ and obtain $f(\bullet, x_2) \circ C = f(x_1, \bullet) \circ C'$, 
which is clearly unsatisfiable because the holes are different on both sides. 
Another example will serve to show that the Root formula is really necessary to 
obtain the correct results. (10) is the (complete) encoding of the (unsatisfiable) 
dominance constraint (9) (a and b are different constants): 

(9) $X{:}a \wedge Y{:}b \wedge X \triangleleft^* Y$ 

(10) $x_T = C_X(x) \wedge x_T = C_Y(y) \wedge x = a \wedge y = b \wedge C_X \circ C = C_Y$ 

The pre-encoding alone (i.e. the last three conjuncts) is satisfiable; together 
with the Root formula, it isn’t. $x_T = C_X(x) \wedge x_T = C_Y(y)$ implies $C_X(x) = C_Y(y)$, 
which, when combined with $C_X \circ C = C_Y$, yields $x = C(y)$. When using $x = a \wedge 
y = b$ as a substitution, we obtain $a = C(b)$, which is not satisfiable. 
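$$[\![\varphi]\!] = [\![\varphi]\!]_p \quad (\varphi \text{ atomic})$$

$$[\![\Phi_1 \wedge \Phi_2]\!] = [\![\Phi_1]\!] \wedge [\![\Phi_2]\!]$$

$$[\![\neg\Phi]\!] = \neg[\![\Phi]\!]$$

$$[\![\exists X.\Phi]\!] = \exists C_X \exists x.(x_T = C_X(x) \wedge [\![\Phi]\!])$$



Fig. 6. Encoding closed first-order formulas over parallelism constraints. 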



6.3 Encoding First-Order Formulae 

In Fig. 6 the encoding of parallelism constraints is lifted to first-order formulae 
$\Phi$. If we restrict ourselves to closed first-order formulae, an explicit Root formula 
is no longer needed; its components are distributed among the encodings of 
existential quantifiers. If we write $\exists\Phi$ for the existential closure of a formula $\Phi$, 
then it holds for all dominance constraints $\varphi$ that: 

$$\exists(\mathrm{Root}(\varphi) \wedge [\![\varphi]\!]_p) = [\![\exists\varphi]\!]$$

Hence, the correctness of the encoding $\mathrm{Root}(\varphi) \wedge [\![\varphi]\!]_p$ claimed in Prop. 2 follows 
from the correctness of the encoding of first-order sentences. 

Now let us turn to the proof of the first-order case. First, we formulate the 
correspondence $[\![\cdot, \cdot]\!]_V$ we announced above. This function maps pairs of tree 
structures $\mathcal{M}^\tau$ and variable assignments $\alpha$ mapping the variables in V to the 
domain of $\tau$ to variable assignments into TC. The goal is that if the arguments 
satisfy a given dominance constraint, the result will satisfy its encoding. 

$$[\![\mathcal{M}^\tau, \alpha]\!]_V(x_T) = \tau$$

$$[\![\mathcal{M}^\tau, \alpha]\!]_V(x) = \tau.\alpha(X) \quad \text{for all } x \text{ such that } X \in V$$

$$[\![\mathcal{M}^\tau, \alpha]\!]_V(C_X) = \tau^{\varepsilon}_{\alpha(X)} \quad \text{for all } C_X \text{ such that } X \in V.$$

With this definition, the following proposition holds. 

Proposition 3. Let $\mathcal{M}^\tau$ be a tree structure, $\alpha$ a variable assignment, and $\Phi$ 
a first-order formula over the parallelism constraints. Then $\Phi$ is satisfied by 
$(\mathcal{M}^\tau, \alpha)$ iff $[\![\Phi]\!]$ is satisfied by $[\![\mathcal{M}^\tau, \alpha]\!]_{\mathcal{FV}(\Phi)}$. 
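
The correspondence can be computed explicitly for a given tree and node assignment. The sketch below builds the assignment for the solution of $X{:}g(Y) \wedge Y \triangleleft^* Z$ shown in Fig. 2 and Fig. 4 and checks the encoded equations by hand; the dictionary keys (`"x_T"`, lower-case names for tree variables, `"C_X"` for context variables) and the function name `correspondence` are assumptions of this example.

```python
HOLE = ("•",)

def subtree(tree, path):
    for k in path:
        tree = tree[k]
    return tree

def punch(tree, path):
    """Replace the subtree at `path` by the hole marker."""
    if not path:
        return HOLE
    k = path[0]
    return tree[:k] + (punch(tree[k], path[1:]),) + tree[k + 1:]

def apply(context, term):
    if context == HOLE:
        return term
    return (context[0],) + tuple(apply(c, term) for c in context[1:])

def correspondence(tau, alpha):
    """x_T -> tau, x -> subtree at alpha(X), C_X -> context from the root to alpha(X)."""
    beta = {"x_T": tau}
    for X, node in alpha.items():
        beta[X.lower()] = subtree(tau, node)
        beta["C_" + X] = punch(tau, node)
    return beta

a, b = ("a",), ("b",)
tau = ("g", ("f", ("f", b, a), a))
alpha = {"X": (), "Y": (1,), "Z": (1, 1, 2)}     # a solution of X:g(Y) and Y dominates Z
beta = correspondence(tau, alpha)

# Root formula: x_T = C_X(x) for every node variable.
root_ok = all(apply(beta["C_" + X], beta[X.lower()]) == beta["x_T"] for X in alpha)

# Pre-encoding of X:g(Y): C_Y = C_X o g(•).
label_ok = beta["C_Y"] == apply(beta["C_X"], ("g", HOLE))

# Pre-encoding of Y dominating Z: C_Y o C = C_Z, with the context between
# alpha(Y) and alpha(Z) as the witness for the fresh variable C.
witness = punch(subtree(tau, alpha["Y"]), alpha["Z"][len(alpha["Y"]):])
dominance_ok = apply(beta["C_Y"], witness) == beta["C_Z"]

print(root_ok, label_ok, dominance_ok)           # True True True
```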

Proof. We prove the proposition by structural induction. First, we show that it 
is true for the atomic constraints; towards the end of the proof, we conduct the 
induction steps. Throughout the proof, we write $\beta = [\![\mathcal{M}^\tau, \alpha]\!]_{\mathcal{FV}(\Phi)}$ for brevity. 

- $X \triangleleft^* Y$. The treatment of $X/X' \sim Y/Y'$ is analogous. 

“⇒” Assume that $(\mathcal{M}^\tau, \alpha)$ satisfies $X \triangleleft^* Y$; we show that $\beta$ satisfies the 
encoding $\exists C\,(C_X \circ C = C_Y)$. Our assumption means that $\alpha(X)$ is a prefix 
of $\alpha(Y)$. Hence, we can construct a variable assignment $\beta'$ that is like $\beta$, 
but assigns $\tau^{\alpha(X)}_{\alpha(Y)}$ to C. By Lemma 8, $\beta'$ solves $C_X \circ C = C_Y$ and, thus, 
$\beta$ is a solution of $[\![X \triangleleft^* Y]\!]$. 

“⇐” Assume that $\exists C\,(C_X \circ C = C_Y)$ is satisfied by $\beta$. Then there must be a 
context $\gamma$ such that $\beta(C_X) \circ \gamma = \beta(C_Y)$; hence, $\alpha(X)$ must be a prefix 
of $\alpha(Y)$, and $(\mathcal{M}^\tau, \alpha)$ satisfies $X \triangleleft^* Y$. 

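- $X{:}f(X_1, \ldots, X_n)$, where $n \geq 1$ 

“⇒” Assume that $(\mathcal{M}^\tau, \alpha)$ satisfies $X{:}f(X_1, \ldots, X_n)$; we assume $1 \leq i \leq n$ 
and conclude that $\beta$ satisfies all equations $C_{X_i} = C_X \circ f(x_1, \ldots, \bullet, \ldots, x_n)$ 
where the hole marker $\bullet$ is at position i. 

Let u be the context-valued term $C_X \circ f(x_1, \ldots, \bullet, \ldots, x_n)$. We first 
show that the holes of $\beta(u)$ and $\beta(C_{X_i})$ are the same, and then that 
their values on $\beta(x_i)$ are equal. (Here we need $n \geq 1$, as $x_i$ would not 
exist otherwise.) From Lemma 6 we can then conclude $\beta(u) = \beta(C_{X_i})$. 
The hole of $\beta(C_{X_i}) = \tau^{\varepsilon}_{\alpha(X_i)}$ is $\alpha(X_i)$, and that of $\beta(u)$ is $\alpha(X)i$. Since 
$(\mathcal{M}^\tau, \alpha)$ is a solution of $X{:}f(X_1, \ldots, X_n)$, we have $\alpha(X)i = \alpha(X_i)$, and 
hence the holes are equal. 

We already noticed that $\alpha(X)i = \alpha(X_i)$ for all $1 \leq i \leq n$. Lemma 2 
implies that 

$$\beta(x) = \tau.\alpha(X) = f(\tau.\alpha(X)1, \ldots, \tau.\alpha(X)n) = f(\tau.\alpha(X_1), \ldots, \tau.\alpha(X_n)) = f(\beta(x_1), \ldots, \beta(x_n))$$

Based on this equation and Lemma 7 we are now in the position to prove 
$\beta(u)[\beta(x_i)] = \beta(C_{X_i})[\beta(x_i)]$ (and thus $\beta(u) = \beta(C_{X_i})$ as required): 

$$\beta(u)[\beta(x_i)] = \tau^{\varepsilon}_{\alpha(X)}[f(\beta(x_1), \ldots, \beta(x_n))] = \tau^{\varepsilon}_{\alpha(X)}[\beta(x)] = \tau$$

$$\beta(C_{X_i})[\beta(x_i)] = \tau^{\varepsilon}_{\alpha(X_i)}[\beta(x_i)] = \tau$$

“⇐” Assume that $\beta$ solves the equation $C_{X_i} = C_X \circ f(x_1, \ldots, \bullet, \ldots, x_n)$ for 
some $1 \leq i \leq n$, where the hole $\bullet$ is at position i. Lemma 7 yields 

$$\tau = \tau^{\varepsilon}_{\alpha(X)}[\tau.\alpha(X)] = \beta(C_X)[\beta(x)]$$

$$\tau = \tau^{\varepsilon}_{\alpha(X_i)}[\tau.\alpha(X_i)] = \beta(C_{X_i})[\beta(x_i)] = \beta(C_X)[f(\beta(x_1), \ldots, \beta(x_n))]$$

Since context functions are one-to-one and $\beta(C_X)$ is a context function, 
these equations imply $\beta(x) = f(\beta(x_1), \ldots, \beta(x_n))$. This is equivalent to 

$$\tau.\alpha(X) = f(\tau.\alpha(X_1), \ldots, \tau.\alpha(X_n)),$$

which in turn means that $(\mathcal{M}^\tau, \alpha)$ solves $X{:}f(X_1, \ldots, X_n)$. 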

- $X{:}a$ 

“⇒” Assume that $(\mathcal{M}^{\tau}, \alpha)$ satisfies $X{:}a$. Since $ar(a) = 0$, it follows that $\tau.\alpha(X) = a$, so $\beta$ solves $x = a$. 

“⇐” Assume that $\beta$ satisfies $x = a$; then $\beta(x) = \tau.\alpha(X) = a$ and, hence, $(\mathcal{M}^{\tau}, \alpha)$ solves $X{:}a$. 



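Of the complex cases, negation and conjunction are trivial. Existential quantification is more interesting: 

- $\exists X.\Phi$ 

“⇒” We assume that $(\mathcal{M}^{\tau}, \alpha)$ satisfies $\exists X.\Phi$, so there is a path $\pi$ such that $(\mathcal{M}^{\tau}, \alpha[\pi/X])$ solves $\Phi$. By induction hypothesis, $[\![\mathcal{M}^{\tau}, \alpha[\pi/X]]\!]_{FV(\Phi)}$ satisfies $[\![\Phi]\!]$. On the free variables of $\Phi$, this variable assignment agrees with $\beta[\tau.\pi/x, \tau^{\pi}/C_X]$, and the latter variable assignment satisfies $x_{\tau} = C_X(x)$ as well. Thus $\beta$ solves $\exists C_X \exists x\,([\![\Phi]\!] \wedge x_{\tau} = C_X(x))$. 

“⇐” We assume that $\beta = [\![\mathcal{M}^{\tau}, \alpha]\!]_{FV(\exists X \Phi)}$ solves $\exists C_X \exists x\,([\![\Phi]\!] \wedge x_{\tau} = C_X(x))$. There is a $\pi$ such that $\beta[\tau.\pi/x, \tau^{\pi}/C_X]$ solves $[\![\Phi]\!]$. Since $\beta[\tau.\pi/x, \tau^{\pi}/C_X]$ is equal to $[\![\mathcal{M}^{\tau}, \alpha[\pi/X]]\!]$ on all free variables of $[\![\Phi]\!]$, it follows from the induction hypothesis that $(\mathcal{M}^{\tau}, \alpha[\pi/X])$ solves $\Phi$. Hence $(\mathcal{M}^{\tau}, \alpha)$ solves $\exists X \Phi$. □ 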



Corollary 1 (Encoding First-Order Formulae). A closed first-order formula $\Phi$ over dominance and parallelism constraints is satisfied by a pair $(\mathcal{M}^{\tau}, \alpha)$ iff there is a variable assignment $\beta$ into TC that solves $[\![\Phi]\!]$ such that $\beta(x_{\tau}) = \tau$. 



7 Context Unification into Parallelism Constraints 

We finally show how to express equations of context unification by parallelism 
constraints. This is not obvious, but it follows from a result of [24] which shows 
that CU has the same expressive power as equality up-to constraints. Equality 
up-to constraints can be translated to parallelism constraints plus similarity 
constraints. Finally, one can get rid of similarity constraints by a neat trick. 

An equality up-to constraint is a conjunction of atomic constraints of the fol- 
lowing form, which are interpreted in the algebra TC. 

$\varphi ::= x/x' = y/y' \mid x = f(x_1, \ldots, x_n) \mid \varphi \wedge \varphi'$ 

An atomic equality up-to constraint $x/x' = y/y'$ is satisfied by a variable assignment $\beta$ into TC if there is a context $\gamma$ such that $\beta(x) = \gamma[\beta(x')]$ and $\beta(y) = \gamma[\beta(y')]$. Intuitively, this is the case iff the trees denoted by $x$ and $y$ are equal, up to an occurrence of $x'$ in $x$ and of $y'$ in $y$ respectively. Equality up-to constraints are equivalent to context unification: 

Proposition 4 (Equality up-to Constraints and CU [24]). For every equation 
system of context unification, there is a satisfaction equivalent equality up-to 
constraint, and vice versa. 
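
The satisfaction condition for an atomic equality up-to constraint can be checked directly on small ground trees. The following is a minimal sketch, not part of the original development: it assumes trees are nested tuples `(label, child_1, ..., child_n)`, and the names `HOLE`, `plug`, and `equal_up_to` are hypothetical helpers introduced only for illustration. It searches for a hole position at which both trees share the same one-hole context.

```python
# Trees are nested tuples: (label, child_1, ..., child_n).
HOLE = ("<hole>",)  # assumed hole marker; must not be a symbol of the signature


def has_path(tree, path):
    """Return True if `path` (a tuple of 1-based child indices) exists in `tree`."""
    for i in path:
        if not 1 <= i < len(tree):
            return False
        tree = tree[i]
    return True


def subtree(tree, path):
    """Return the subtree of `tree` at `path`; the path must exist."""
    for i in path:
        tree = tree[i]
    return tree


def plug(tree, path, new):
    """Return a copy of `tree` whose subtree at `path` is replaced by `new`."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (plug(tree[i], path[1:], new),) + tree[i + 1:]


def all_paths(tree):
    """Enumerate every path in `tree`, starting with the root path ()."""
    yield ()
    for i in range(1, len(tree)):
        for rest in all_paths(tree[i]):
            yield (i,) + rest


def equal_up_to(x, x_sub, y, y_sub):
    """Check whether some context g satisfies x = g[x_sub] and y = g[y_sub]."""
    for p in all_paths(x):
        if subtree(x, p) != x_sub:
            continue
        if not has_path(y, p) or subtree(y, p) != y_sub:
            continue
        if plug(x, p, HOLE) == plug(y, p, HOLE):
            return True
    return False


a, b = ("a",), ("b",)
x = ("f", ("g", a), b)
y = ("f", ("g", b), b)
```

Here `equal_up_to(x, a, y, b)` holds because both trees arise from the context f(g(•), b). The sketch only tests ground trees; it does not solve constraints with variables.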

With this result, it remains to encode equality up-to constraints into parallelism 
constraints. This would be simple if parallelism constraints could express similarity 
constraints. A similarity constraint has the form $X \sim Y$ and is interpreted 
by the similarity relation. A similarity relationship $\pi \sim \pi'$ holds for two nodes if 
$\tau.\pi = \tau.\pi'$ (i.e. if the subtrees below $\pi$ and $\pi'$ are the same). 

Lemma 9. If the signature S contains a single constant, then parallelism con- 
straints can express similarity constraints. 
$[\![x/x' = y/y']\!]^{\sim} = X/X'' \sim Y/Y'' \wedge X' \sim X'' \wedge Y' \sim Y''$    ($X'', Y''$ fresh) 

$[\![x = f(x_1, \ldots, x_n)]\!]^{\sim} = X{:}f(X'_1, \ldots, X'_n) \wedge \bigwedge_{i=1}^{n} X_i \sim X'_i$    ($X'_1, \ldots, X'_n$ fresh) 

Fig. 7. Encoding equality up-to into parallelism and similarity constraints. 



Proof. Let $a$ be the unique constant of $\Sigma$. Every finite tree must contain a node 
labeled with $a$; so the following equivalence holds for all tree models: 

$X \sim Y \leftrightarrow \exists Z \exists Z'\,(Z{:}a \wedge Z'{:}a \wedge X/Z \sim Y/Z')$ □ 

If the number of constants in $\Sigma$ is finite, we can express $X \sim Y$ by a finite 
disjunction; but this would not lead to a polynomial time transformation. There 
is, however, a neat trick to work around this, which applies even for infinitely many constants. 

Lemma 10. For every signature $\Sigma$, there exists a signature $\Sigma'$ with a single 
constant such that parallelism and similarity constraints over $\Sigma$ can be translated 
in linear time into satisfiability equivalent constraints of the same kind over $\Sigma'$. 

Proof. For any signature $\Sigma$, let $\Sigma'$ be the signature consisting of all non-constant 
symbols of $\Sigma$, plus the constants of $\Sigma$ considered as unary function symbols, plus 
a new constant $a$. We transform each parallelism constraint $\varphi$ into a constraint 
$\varphi'$ by replacing every constraint $X{:}b$ by $\exists Y\,(X{:}b(Y) \wedge Y{:}a)$. Now it is easy to 
see that $\varphi$ is satisfiable over $\Sigma$ iff $\varphi'$ is satisfiable over $\Sigma'$. □ 
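
The translation in this proof is a single pass over the constraint, which is where the linear time bound comes from. The sketch below illustrates it under assumptions of our own: a constraint is a list of tuples, labeling constraints are written `("label", X, f, children)`, the existential quantifier is modeled by generating a fresh variable name, and `single_constant` and `a_new` are hypothetical names.

```python
from itertools import count


def single_constant(constraints, new_const="a_new"):
    """Replace every constant labeling X:b by X:b(Y) and Y:a with Y fresh.

    Labeling constraints are tuples ("label", X, f, children); every other
    atomic constraint is copied unchanged.
    """
    fresh_names = (f"_Y{i}" for i in count())
    translated = []
    for c in constraints:
        if c[0] == "label" and len(c[3]) == 0:
            _, x, b, _ = c
            y = next(fresh_names)  # plays the role of the quantified Y
            translated.append(("label", x, b, (y,)))
            translated.append(("label", y, new_const, ()))
        else:
            translated.append(c)
    return translated


phi = [
    ("label", "X", "b", ()),      # X:b, with b a constant
    ("dom", "Z", "X"),            # a dominance constraint, left unchanged
    ("label", "Z", "f", ("X",)),  # Z:f(X), left unchanged
]
phi_prime = single_constant(phi)
```

Each atomic constraint is visited once and produces at most two output constraints, so the output size is linear in the input size.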

Theorem 1 (Parallelism Constraints = Context Unification). For every 
parallelism constraint $\varphi$, there is a satisfiability equivalent equation system of 
context unification, and vice versa. 

Proof. The correctness of an encoding of parallelism constraints into CU is stated 
in Proposition El 

For the converse, we first express CU by equality up-to constraints according to 
Proposition 4. Second, we encode equality up-to constraints by parallelism and 
similarity constraints. This is quite easy; an encoding $[\![\cdot]\!]^{\sim}$ is defined in Figure 7. 
In order to encode $\varphi$, we assume a node variable $X$ for every tree variable $x$ 
occurring in $\varphi$. The variable $X$ is supposed to denote the root node of an 
occurrence of $x$ in the solution of the encoding of $\varphi$. It is obvious that $[\![\cdot]\!]^{\sim}$ preserves 
satisfiability. The encoding of $x/x' = y/y'$ expresses that somewhere below the 
nodes $X$ and $Y$, there are nodes $X''$ and $Y''$ the trees below which look just 
like the trees below the nodes $X'$ and $Y'$, and the contexts between $X$ and $X''$ 
and $Y$ and $Y''$ are equal. (Note that this is a weaker condition than parallelism 
itself; it does not say anything about the locations of the nodes denoted by $X'$ 
and $Y'$.) The encoding of the equation $x = f(x_1, \ldots, x_n)$ works similarly: it expresses 
that $X$ is labeled with $f$ and that its subtrees look just like the subtrees below 
the $X_1, \ldots, X_n$. 

Third, we switch to a signature with a single constant, which we can do according 
to Lemma 10. We can now express all similarity constraints by parallelism 
constraints (Lemma 9), which completes the proof. □ 

8 Conclusion 

The main result of this paper is that context unification has the same expressive 
power as parallelism constraints. Parallelism constraints subsume dominance 
constraints. The most involved part was to embed dominance constraints into 
CU. The inverse direction from CU to parallelism constraints proceeds via a 
detour through equality up-to constraints, which have the same expressiveness 
as CU as well. 

The correspondence between CU and CLLS has two important consequences. 
For one, it allows us to transfer complexity and decidability results. For the time 
being, however, the decidability of either language is unknown. In contrast, the 
satisfiability problem of dominance constraints has been shown to be NP-complete. Of 
course, NP-hardness for several fragments of CU was well known before. 

The other consequence is that CU can be easily expressed by parallelism 
constraints in CLLS, which explains why the linguistic application given for CU 
in earlier work carries over to CLLS. Furthermore, this application of CU is clarified. In 
earlier papers, scope ambiguities could be described in CU but only in a somewhat 
opaque fashion. In the light of the results presented, it becomes clear 
that the equations used previously were really just encodings of dominance and 
parallelism constraints. 
Abstract. Since Peter Aczel's theory of hypersets, many applications 
to formalizations of circular objects such as mutual beliefs have been 
proposed. This paper proposes Membership Description Systems 
(MDSs), a partial revision system of circular objects, and their application 
to a dynamic semantics of dialogue, in the sense that a dialogue 
can be considered as a revision process of mutual beliefs between its 
agents. Although usual dynamic semantics updates a variable assignment 
or a set of information states, our proposed semantics of 
dialogues directly updates situations, which are specified by MDSs, just as the 
dynamic semantics of circular propositions directly updates situations. 
Furthermore, using MDSs as updated objects in the semantics makes a 
partial and direct update of circular situations themselves possible. As a 
result, a dynamic semantics of a language with the ↓-operator, which was 
introduced to describe circular propositions, can be provided. 



1 Introduction 

This paper proposes Membership Description Systems (MDSs), a formal system 
for the revision of some types of infinitely structured objects, called circular 
objects, which can serve as formal models of liar sentences [3], mutual beliefs, shared 
information, and common knowledge [2]. They can be a basis for a dynamic semantics 
of dialogue, in the sense that a dialogue can be considered as a revision 
process of mutual beliefs between its agents. 

MDSs are theoretically based on the framework of circular objects, 
which exploits hyperset theory, in which we can define circular sets 
as the unique solution to an equational system of sets. Although usual dynamic 
semantics updates variable assignments or states as a set of possible variable 
assignments, a semantics based on MDSs updates situations directly, as the dynamic 
semantics of circular propositions does, and furthermore circular situations are also revised directly in 
the semantics. 

Section 2 is a brief introduction to some basic concepts of circular objects 
and hyperset theory. Section 3 defines a language for the description 
of mutual beliefs and two types of dynamic semantics for it, neither of which can treat 
direct update of circular situations. Section 4 defines MDSs, their 
relevant concepts, and a semantics based on MDSs which can treat direct update 
of circular situations. 

2 Theory of Circular Objects Based on Hyperset Theory 

2.1 Circular Objects and Their Revisions 

We can find some types of infinite objects as follows. 

(1) a. f(f(f(f(...)))) 

b. f(a,f(a,f(a,f(a,...)))) 

c. f(a, f(f(a), f(f(f(a))), f(f(f(f(a)))), . . .)))) 

d. f(a, f(f(f(a)), f(f(f(5(a)), f(a)), . . .))) 

(1a)-(1c) can be recognized as regular, since we can find some regularity to 
predict their further detailed structure, while (1d) cannot be recognized as 
regular from this sequence. Furthermore, (1a) and (1b) are simple repetitions, while 
(1c) is not a simple repetition. We say that an object is a circular object¹ if it 
has a simple repetitive structure as in (1a,b). For example, (2), namely a mutual belief 
between Max and Claire, is a circular object which we can find in the real world. 

(2) Max believes that Max has the ace of hearts, 

Claire believes that Max has the ace of hearts. 

Max believes that Max believes that Max has the ace of hearts and that 
Claire believes that Max has the ace of hearts, 

Claire believes that Max believes that Max has the ace of hearts and that 
Claire believes that Max has the ace of hearts, and ... 

We mean by revisions of circular objects such a change from (j1 all to (II hji . or from 
(LUJ) to (|T^ . Therefore, mutual belief revisions by dialogues are also considered 
as revisions of circular objects. 
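
The simple repetitions (1a) and (1b) can be pictured as finite systems of equations that are unfolded on demand. The following sketch is only an illustration under our own assumptions: a system is a Python dictionary mapping a variable to a label and an argument list, and `unfold` is a hypothetical helper that prints the first levels of the infinite term.

```python
def unfold(system, var, depth):
    """Unfold the equation for `var` up to `depth` levels, then print '...'."""
    if depth == 0:
        return "..."
    label, args = system[var]
    parts = [
        unfold(system, arg, depth - 1) if arg in system else arg
        for arg in args
    ]
    return f"{label}({','.join(parts)})"


# (1a) as the single equation x = f(x)
system_1a = {"x": ("f", ["x"])}
# (1b) as the single equation x = f(a, x)
system_1b = {"x": ("f", ["a", "x"])}

term_1a = unfold(system_1a, "x", 3)  # "f(f(f(...)))"
term_1b = unfold(system_1b, "x", 2)  # "f(a,f(a,...))"
```

In this picture, a revision from (1a) to (1b) is a change of one finite equation, from `x = f(x)` to `x = f(a, x)`, rather than an edit of an infinite term.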

2.2 Hyperset Theory 

The term ‘hyperset theory’ is informal. We mean by the term a set theory which 
has the Anti-Foundation Axiom (AFA) instead of the Foundation Axiom. AFA 
has many formalizations, for example: 

- (i) Every flat system of equations $(X, A, e)$ has a unique solution; 

- (ii) Every graph $G$ over $A$ has a unique decoration; 

- (iii) For every proper substitution $e$ there is a unique proper substitution $s$ 
such that $s = s \star e$. 

The most remarkable characteristic of hyperset theory is that it 
admits circular sets such as $x \in x$. 

This paper concentrates on the formalization using equational systems of sets, 
which is introduced in detail in the next subsection. 



^ ini call it as a regular tree. 
2.3 Equational Systems of Sets 

A circular set such as $x = \{x, a\}$ is specified by the notion of equational systems 
of sets, defined as follows. 

Definition 1 (Barwise & Moss). 

1. A flat equational system of sets is a triple $\mathcal{E} = (X, A, e)$, where $X$ and $A$ are 
sets of urelements such that $X \cap A = \emptyset$, and $e$ is a function $e : X \to pow(X \cup A)$. 

2. $X$ is called the set of indeterminates of $\mathcal{E}$, and $A$ is called the set of atoms 
of $\mathcal{E}$. 

3. A solution to $\mathcal{E}$ is a function $\theta$ with domain $X$ satisfying $\theta(x) = \{\theta(y) \mid y \in e(x) \cap X\} \cup (e(x) \cap A)$, for each $x \in X$. 

If $e$ is a function from indeterminates $X$ to $pow(A \cup X)$, then the system is called 
flat, while if $e$ is a function from indeterminates $X$ to any set constructed from 
$A$ and $X$, the system is called general. For example, $\{x = (a, x)\}$ is 
general, but one of its equivalents $\{x = \{y, z\},\ y = \{a\},\ z = \{a, x\}\}$ is flat. 

Using the concept of flat equational systems of sets, we can state a form of 
the Anti-Foundation Axiom as follows: 

Anti-Foundation Axiom: Every flat equational system of sets has a 
unique solution $\theta$. 

Let $solution\text{-}set(\mathcal{E})$ be the set $\{\theta(x) \mid x \in X\}$ where $\mathcal{E} = (X, A, e)$. We can 
define the hyperuniverse $V_{afa}[U]$ as follows. 

$V_{afa}[U] = \bigcup \{solution\text{-}set(\mathcal{E}) \mid \mathcal{E}$ is a flat equational system of sets with atoms $A \subseteq U\}$. 
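
Python sets cannot contain themselves, so the solution of a circular system cannot be built literally. What can be computed on a finite flat system is which indeterminates receive the same set: under AFA, two indeterminates are commonly characterized as denoting the same set exactly when they are bisimilar. The sketch below computes the largest bisimulation by refinement; the dictionary encoding of $e$ and the name `same_set_pairs` are assumptions made for this illustration.

```python
def same_set_pairs(e, atoms):
    """Return the largest bisimulation on a finite flat system.

    `e` maps each indeterminate to the set of its members (indeterminates
    or atoms). An atom is related only to itself; two indeterminates stay
    related while each member of one has a related member in the other.
    """
    nodes = set(e) | set(atoms)
    rel = {
        (u, v)
        for u in nodes
        for v in nodes
        if (u in e and v in e) or u == v
    }

    def matches(x, y):
        forth = all(any((m, n) in rel for n in e[y]) for m in e[x])
        back = all(any((m, n) in rel for m in e[x]) for n in e[y])
        return forth and back

    changed = True
    while changed:
        changed = False
        for (u, v) in list(rel):
            if u in e and v in e and not matches(u, v):
                rel.discard((u, v))
                changed = True
    return rel


# x = {x, a}, y = {y, a}, z = {z}
system = {"x": {"x", "a"}, "y": {"y", "a"}, "z": {"z"}}
pairs = same_set_pairs(system, atoms={"a"})
```

In this example `x` and `y` are related, reflecting that $x = \{x, a\}$ and $y = \{y, a\}$ have the same unique solution, while `z` is related to neither.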

3 A Language of Mutual Belief and Its Dynamic 
Semantics 

Now we define a minimal language $\mathcal{L}$ in order to describe mutual beliefs, 
consisting of sentences defined as follows. 

$\varphi ::= has(a, c) \mid Bel(a, \varphi) \mid Bel(a, v \wedge \varphi) \mid \varphi_1 \wedge \varphi_2 \mid\ \downarrow v.\varphi$, 

where $c \in C = \{2♣, \ldots, A♠\}$ (a set of card symbols), $v \in Var$ (a set of sentence 
variables), and $a \in AG = \{max, claire\}$ (a set of agent symbols). $\downarrow v.\varphi$ means 
that it is a unique solution to $v = \varphi$, i.e., the ↓-operator indicates the scope of 
its circularity.² 

Example 1. (2) amounts to the sentence: 

(*) $\downarrow v.Bel(Max, v \wedge has(Max, A♥)) \wedge Bel(Claire, v \wedge has(Max, A♥))$. 

^ This is basically the same with the scope indicator of self-referential terms in the 
language of 0. 
As a model of $\mathcal{L}$, we define the class SOA of states of affairs and the class SIT of 
situations. 

Definition 2. Assume the Anti-Foundation Axiom. Let $H = \{\langle\langle Has, a, c\rangle, 1\rangle \mid a \in A = \{Claire, Max\},\ c \in \{2♣, \ldots, A♠\}\}$ be a set of states of affairs, and $\Phi$ be 
a functor such that $\Phi(X) = \{\langle\langle Bel, a, s\rangle, 1\rangle \mid a \in A,\ s \subseteq X\} \cup H$. SOA is the 
greatest fixed point of $\Phi$. If $s \subseteq SOA$, then $s \in SIT$. 

$\Phi$ has a greatest fixed point, since $\Phi$ is obviously monotone and the Knaster-Tarski 
Theorem assures the existence of the greatest fixed point and the least 
fixed point of monotone operators taking a complete lattice to a complete lattice. 

Theorem 1 (the Knaster-Tarski Theorem). Let $\mathcal{L} = (L, \subseteq)$ be a 
complete lattice and $F : L \to L$ be a monotone function. Then $F$ has the least 
fixed point $lfp(F)$ and the greatest fixed point $gfp(F)$. Furthermore, $lfp(F) = \bigcap\{A \mid F(A) \subseteq A\}$ and $gfp(F) = \bigcup\{A \mid A \subseteq F(A)\}$. 
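
On a finite powerset lattice, both fixed points of a monotone operator can be reached by plain iteration, from the empty set for the least fixed point and from the full universe for the greatest. The sketch below is a toy analogue of the construction of SOA, not the operator $\Phi$ itself: the universe, the `REFERS_TO` table, and the operator `F` are assumptions chosen so that two elements refer to each other circularly.

```python
def iterate_to_fixed_point(F, start):
    """Apply F repeatedly until the value stops changing.

    For a monotone F on a finite powerset lattice this terminates: starting
    from the empty set it yields lfp(F), starting from the universe gfp(F).
    """
    current = frozenset(start)
    while True:
        nxt = frozenset(F(current))
        if nxt == current:
            return current
        current = nxt


UNIVERSE = frozenset({"p", "q", "h"})
# p and q mention each other (a circular pair); h mentions nothing.
REFERS_TO = {"p": {"q", "h"}, "q": {"p", "h"}, "h": set()}


def F(S):
    """Keep the elements whose references all lie in S (monotone in S)."""
    return {n for n in UNIVERSE if REFERS_TO[n] <= S}


lfp = iterate_to_fixed_point(F, frozenset())
gfp = iterate_to_fixed_point(F, UNIVERSE)
```

The least fixed point contains only the well-founded element `h`, whereas the greatest fixed point also admits the circular pair `p`, `q`. This mirrors why SOA is taken as a greatest fixed point when circular beliefs must be members.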

We can model a mutual belief between Max and Claire (*) as $b = \{p, q\}$, 
where $p = \langle Bel, Claire, b \cup \{\langle Has, Max, A♥; 1\rangle\}; 1\rangle$ and $q = \langle Bel, Max, b \cup \{\langle Has, Max, A♥; 1\rangle\}; 1\rangle$, according to Fagin et al. and Barwise.³ We will 
write $\langle\sigma; 1\rangle$ for $\langle\langle\sigma\rangle, 1\rangle$. 

Proposition 1. The mutual belief $b$ defined above is a member of SIT. 

Proof. We have only to show $p, q \in SOA$. Let $X = SOA \cup \{p, q\}$. Then 
$\Phi(X) = \{\langle Bel, a, s; 1\rangle \mid a \in A,\ s \subseteq SOA \cup \{p, q\}\} \cup H$, which includes $\{p, q\}$, i.e., 
$\{\langle Bel, Claire, \{p, q, \langle Has, Max, A♥; 1\rangle\}; 1\rangle,\ \langle Bel, Max, \{p, q, \langle Has, Max, A♥; 1\rangle\}; 1\rangle\}$. That is, 

$\{p, q\} \subseteq \Phi(X)$ (1) 

Since SOA is a fixed point of $\Phi$, $\Phi(SOA) = SOA$. Then by monotonicity, 
$\Phi(SOA) \subseteq \Phi(X)$. So $SOA \subseteq \Phi(X)$. Hence, by (1), $SOA \cup \{p, q\} \subseteq \Phi(SOA \cup \{p, q\})$. But then by the property of greatest fixed points, for any 
$A$, if $A \subseteq \Phi(A)$, then $A \subseteq SOA$, from which it follows that $p, q \in SOA$. □ 



3.1 Dynamic Scope- Taking 

As a basic property of dynamic semantics, it admits dynamic scope-taking of 
an operator of the object language. In Dynamic Predicate Logic (DPL), the 
following equivalence holds: 

$(\exists v.\varphi_1) \wedge \varphi_2 = \exists v.(\varphi_1 \wedge \varphi_2)$ 

since DPL's semantics is defined as follows: for a set of assignments $s$, 

- $[\![\pi(t_1, \ldots, t_n)]\!](s) = s \cap \{g \mid M \models \pi(t_1, \ldots, t_n)[g]\}$, 

- $[\![\varphi_1 \wedge \varphi_2]\!] = [\![\varphi_1]\!] \circ [\![\varphi_2]\!]$, 

- $[\![\exists x.\varphi]\!] = [\sim_x] \circ [\![\varphi]\!]$, where $[\sim_x](s) = \{h \mid g \sim_x h,\ g \in s\}$. 

³ This modeling is classified as the Fixedpoint approach in Barwise's terms. 

We will define the notion of dynamic scope-taking as follows. 

Definition 3. A unary operator $O$ takes its scope dynamically w.r.t. a binary 
operator $P$ if $P(O(x_1), x_2) = O(P(x_1, x_2))$. 

Corollary 1. In DPL, $\exists v$ takes its scope dynamically w.r.t. $\wedge$. 
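
The corollary follows because conjunction is sequential composition, which is associative. The sketch below replays this on a tiny model using the update clauses listed above; the domain, the variable names, and the helpers `atom`, `exists`, and `seq` are assumptions introduced for the illustration.

```python
from itertools import product

DOMAIN = (1, 2, 3)   # assumed finite domain of individuals
VARS = ("x", "y")    # assumed variables; assignments are total on these


def all_assignments():
    """Every total assignment, encoded as a frozenset of (var, value) pairs."""
    return {frozenset(zip(VARS, vals)) for vals in product(DOMAIN, repeat=len(VARS))}


def atom(pred):
    """Update for an atomic formula: keep the assignments satisfying `pred`."""
    return lambda s: {g for g in s if pred(dict(g))}


def exists(var):
    """Update [~var]: reset `var` to every possible value."""
    def update(s):
        out = set()
        for g in s:
            for val in DOMAIN:
                h = dict(g)
                h[var] = val
                out.add(frozenset(h.items()))
        return out
    return update


def seq(f, g):
    """Sequential composition: first f, then g (the reading of conjunction)."""
    return lambda s: g(f(s))


phi1 = atom(lambda g: g["x"] > 1)
phi2 = atom(lambda g: g["x"] == 3)

lhs = seq(seq(exists("x"), phi1), phi2)   # (Ex.phi1) and phi2
rhs = seq(exists("x"), seq(phi1, phi2))   # Ex.(phi1 and phi2)

state = {frozenset({("x", 1), ("y", 1)})}
```

Both `lhs(state)` and `rhs(state)` yield the single assignment with `x = 3` and `y = 1`, although `x` had the value 1 in the input state: the quantifier in the first conjunct still binds `x` in the second.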

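3.2 $\mathfrak{S}_1$: A Semantics of $\mathcal{L}$ Based on ∪-Based Updates 

Exploiting the basic idea of dynamic epistemic semantics⁴, we tentatively propose a dynamic semantics 
of $\mathcal{L}$ based on ∪-based updates of information states. 

Definition 4. For each $\varphi \in \mathcal{L}$, an update function $[\![\varphi]\!] : SIT \to SIT$ is assigned 
in $\mathfrak{S}_1$, where $[\![\varphi]\!]$ is defined by induction on the complexity of $\varphi$ as follows. 

- $[\![has(a, c)]\!](s) = s \cup \{\langle Has, a, c; 1\rangle\}$, 

- $[\![Bel(a, \varphi)]\!](s) = s \cup \{\langle Bel, a, [\![\varphi]\!]; 1\rangle\}$; 

- $[\![Bel(a, v \wedge \varphi)]\!](s) = s \cup \{\langle Bel, a, v \cup [\![\varphi]\!]; 1\rangle\}$; 

- $[\![\varphi_1 \wedge \varphi_2]\!] = [\![\varphi_1]\!] \circ [\![\varphi_2]\!]$ (sequential composition of functions); 

- $[\![\downarrow v.\varphi]\!](s) = v$ where $v = [\![\varphi]\!](s)$ and $v$ is an indeterminate. 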



Example 2. To sentence (*) an update function $G$ is assigned in $\mathfrak{S}_1$, where for 
some $s \in SIT$, $G(s)$ is defined as follows: 

$G(s) = [\![\downarrow v.Bel(Max, v \wedge has(Max, A♥)) \wedge Bel(Claire, v \wedge has(Max, A♥))]\!](s) = v$, 

where 

$v = [\![Bel(Max, v \wedge has(Max, A♥)) \wedge Bel(Claire, v \wedge has(Max, A♥))]\!](s)$ 

$= [\![Bel(Claire, v \wedge has(Max, A♥))]\!]([\![Bel(Max, v \wedge has(Max, A♥))]\!](s))$ 

$= [\![Bel(Claire, v \wedge has(Max, A♥))]\!](\{\langle Bel, Max, v \cup \{\langle Has, Max, A♥; 1\rangle\}; 1\rangle\} \cup s)$ 

$= \{\langle Bel, Claire, v \cup \{\langle Has, Max, A♥; 1\rangle\}; 1\rangle,\ \langle Bel, Max, v \cup \{\langle Has, Max, A♥; 1\rangle\}; 1\rangle\} \cup s$. 

We call such a semantics ∪-based, defined as follows. 

Definition 5. A (dynamic) semantics $S$ of a language $L$ is ∪-based w.r.t. $X$ 
iff for any (non-negative) sentence $\varphi \in L$, $S$ assigns it a function $F : x \mapsto x \cup y$, 
for some $x, y, x \cup y \in X$. 

Lemma 1. $\mathfrak{S}_1$ is ∪-based w.r.t. SIT. 

⁴ Gerbrandy and Groeneveld (1997), “Reasoning about Information Change,” ms., 
propose “Dynamic Epistemic Semantics” (DES). Although this framework is 
concerned with the change of common knowledge, it is not concerned 
with updates of circular objects directly. 
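Proof. (By induction on the complexity of sentences of $\mathcal{L}$.) We have only to show 
that each sentence $\varphi$ of the form $\varphi_1 \wedge \varphi_2$ or $\downarrow v.\varphi_1$ is assigned an update function 
such that for each $s \in SIT$, $s \mapsto s \cup t$ for some $t \in SIT$. Assume that $[\![\varphi_1]\!]$ maps $s$ to $s \cup t$ 
and that $[\![\varphi_2]\!]$ maps $s$ to $s \cup u$. 

$[\![\varphi_1 \wedge \varphi_2]\!](s) = [\![\varphi_1]\!] \circ [\![\varphi_2]\!](s)$ 
$= [\![\varphi_2]\!]([\![\varphi_1]\!](s))$ 
$= [\![\varphi_2]\!](s \cup t)$ 
$= s \cup (t \cup u)$, 

$[\![\downarrow v.\varphi_1]\!](s) = v$ 

$v = [\![\varphi_1]\!](s)$ 

$v = t \cup s$. 

□ 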

However, we cannot deal with the dialogues in (3) as shared belief revision by such a 
∪-based update. 

(3) a. Claire: You have the ace of hearts; Max: Yes. (That's right.) [Agreement] 
b. Max: I have the ace of hearts; Claire: Uh-huh. [Acknowledgement] 

The dialogues in (3) can be considered as the formulas in (4), respectively, where $h = has(Max, A♥)$ 
and $\downarrow v.Bel(Claire, v \wedge h)$ means Claire's introspective belief 
about $h$. 

(4) a. $(\downarrow v.Bel(Claire, v \wedge h)) \wedge Bel(Max, v \wedge h)$ 

$= (\downarrow v.Bel(Claire, v \wedge h) \wedge Bel(Max, v \wedge h))$ 
b. $(\downarrow v.Bel(Max, v \wedge h)) \wedge Bel(Claire, v \wedge h)$ 

$= (\downarrow v.Bel(Max, v \wedge h) \wedge Bel(Claire, v \wedge h))$ 

Namely, in $\mathfrak{S}_1$ the ↓-operator cannot take its scope dynamically w.r.t. $\wedge$, since 
such dynamic scope-taking cannot be dealt with by a simple ∪-based update, as 
explained in (5), where $h' = \langle Has, Max, A♥; 1\rangle$. 

(5) $s \cup \{\langle Bel, Claire, v \cup \{h'\}; 1\rangle,\ \langle Bel, Max, v \cup \{h'\}; 1\rangle\} \neq s \cup \{\langle Bel, Claire, v \cup \{h'\}; 1\rangle,\ \langle Bel, Max, v \cup \{h'\}; 1\rangle\}$, since on the left $v = \{\langle Bel, Max, v \cup \{h'\}; 1\rangle\}$ 
but on the right $v = \{\langle Bel, Max, v \cup \{h'\}; 1\rangle,\ \langle Bel, Claire, v \cup \{h'\}; 1\rangle\}$. 

From the viewpoint of equational systems, such dynamism is considered as 
the following revision of the equational systems. 

(6) a. $\{v = \{\langle Bel, Claire, v \cup \{h'\}; 1\rangle\}\} \mapsto \{v = \{\langle Bel, Max, v \cup \{h'\}\rangle,\ \langle Bel, Claire, v \cup \{h'\}\rangle\}\}$ 

b. $\{v = \{\langle Bel, Max, v \cup \{h'\}; 1\rangle\}\} \mapsto \{v = \{\langle Bel, Max, v \cup \{h'\}\rangle,\ \langle Bel, Claire, v \cup \{h'\}\rangle\}\}$ 

From this example, we can conclude the following proposition. 

Proposition 2. 

1. A semantics that is ∪-based w.r.t. SIT cannot assign a function which can revise 
a circular situation to a sentence of its object language. 

2. A semantics that is ∪-based w.r.t. SIT cannot admit the ↓-operator's dynamic 
scope-taking w.r.t. $\wedge$. 

3.3 $\mathfrak{S}_2$: A Substitution-Based Semantics of $\mathcal{L}$ 

According to jS|, a revision of circularity itself can be defined by a substitution 
which can be defined by corecursion. 

Definition 6 (Barwise & Moss). A substitution is a function $\theta$ whose 
domain is a set of urelements. A substitution operation is an operation $sub$ 
whose domain consists of a class of pairs $\langle\theta, b\rangle$ where $\theta$ is a substitution and 
$b \in U \cup V_{afa}[U]$, such that the following conditions are met. 

1. If $x \in dom(\theta)$, then $sub(\theta, x) = \theta(x)$. 

2. If $x \in U \setminus dom(\theta)$, then $sub(\theta, x) = x$. 

3. For all sets $b$, $sub(\theta, b) = \{sub(\theta, a) \mid a \in b\}$. 

Barwise & Moss have shown the existence and uniqueness of $sub$ (Theorem 8.1). 

Example 3. A substitution $\theta_b$ as the revision function required in (7) is defined 
by corecursion as follows: 

$\theta_b(u) = \theta_b(u) \cup \{b\}$, 

for all indeterminates $u$. 

(7) $\{x = \{a, x\}\} \mapsto \{x = \{a, b, x\}\}$ 

That is, $\theta_b(x) = \theta_b(x) \cup \{b\} = \{\theta_b(x), a\} \cup \{b\} = \{\theta_b(x), a, b\}$. 

The existence and uniqueness of the solution to the equation $\theta_b(x) = \{\theta_b(x), a, b\}$ 
are guaranteed by the Anti-Foundation Axiom. 

Similarly, the revision functions required in (6) are defined by corecursion as 
follows. 

(8) $F_{a,p}(x) = F_{a,p}(x) \cup \{\langle Bel, a, F_{a,p}(x) \cup \{p\}; 1\rangle\}$ 

The dialogues in (3) are interpreted as follows: 

$F_{Max,h'}([\![\downarrow v.Bel(Claire, v \wedge h)]\!](s))$ 

$= F_{Max,h'}(v) \cup \{\langle Bel, Max, F_{Max,h'}(v) \cup \{h'\}; 1\rangle\}$ 

$= F_{Max,h'}(\{\langle Bel, Claire, F_{Max,h'}(v) \cup \{h'\}; 1\rangle\} \cup s) \cup \{\langle Bel, Max, F_{Max,h'}(v) \cup \{h'\}; 1\rangle\}$ 

$= \{\langle Bel, Claire, F_{Max,h'}(v) \cup \{h'\}; 1\rangle\} \cup \{F_{Max,h'}(p) \mid p \in s\} \cup \{\langle Bel, Max, F_{Max,h'}(v) \cup \{h'\}; 1\rangle\}$ 

$= \{\langle Bel, Claire, F_{Max,h'}(v) \cup \{h'\}; 1\rangle,\ \langle Bel, Max, F_{Max,h'}(v) \cup \{h'\}; 1\rangle\} \cup \{F_{Max,h'}(p) \mid p \in s\}$ 

$= F_{Max,h'}(v)$ 

That is, $F_{Max,h'}(v) = \{\langle Bel, Claire, F_{Max,h'}(v) \cup \{h'\}; 1\rangle,\ \langle Bel, Max, F_{Max,h'}(v) \cup \{h'\}; 1\rangle\} \cup \{F_{Max,h'}(p) \mid p \in s\}$. 

$F_{Claire,h'}([\![\downarrow v.Bel(Max, v \wedge h)]\!](s))$ 

$= F_{Claire,h'}(v) \cup \{\langle Bel, Claire, F_{Claire,h'}(v) \cup \{h'\}; 1\rangle\}$ 

$= F_{Claire,h'}(\{\langle Bel, Max, F_{Claire,h'}(v) \cup \{h'\}; 1\rangle\} \cup s) \cup \{\langle Bel, Claire, F_{Claire,h'}(v) \cup \{h'\}; 1\rangle\}$ 

$= \{\langle Bel, Max, F_{Claire,h'}(v) \cup \{h'\}; 1\rangle\} \cup \{F_{Claire,h'}(p) \mid p \in s\} \cup \{\langle Bel, Claire, F_{Claire,h'}(v) \cup \{h'\}; 1\rangle\}$ 

$= \{\langle Bel, Max, F_{Claire,h'}(v) \cup \{h'\}; 1\rangle,\ \langle Bel, Claire, F_{Claire,h'}(v) \cup \{h'\}; 1\rangle\} \cup \{F_{Claire,h'}(p) \mid p \in s\}$ 

$= F_{Claire,h'}(v)$ 

That is, $F_{Claire,h'}(v) = \{\langle Bel, Max, F_{Claire,h'}(v) \cup \{h'\}; 1\rangle,\ \langle Bel, Claire, F_{Claire,h'}(v) \cup \{h'\}; 1\rangle\} \cup \{F_{Claire,h'}(p) \mid p \in s\}$. 

Although this result achieves the aim of providing a formal semantics of 
dialogues as mutual belief revisions, it cannot grasp the dynamism of ↓ and $\wedge$ 
in (4), since the semantics assigns the substitution $F_{a,p}$ to utterances like ‘Yes’ and 
‘Uh-huh’ directly and not to formulas. As a result, we must revise the language $\mathcal{L}$ 
itself by adding a new special formula, say, $F(a, it)$, where $a$ denotes the utterer 
of the agreement or acknowledgement and $it$ refers to the agreed or acknowledged 
proposition. For example, the dialogues in (3) are expressed as follows in this expanded 
language, and the ↓-operators in the sentences take their scope dynamically w.r.t. $\wedge$. 

(9) a. $(\downarrow v.Bel(Claire, v \wedge h)) \wedge F(Max, it)$ 

$= (\downarrow v.Bel(Claire, v \wedge h) \wedge F(Max, it))$ 
b. $(\downarrow v.Bel(Max, v \wedge h)) \wedge F(Claire, it)$ 

$= (\downarrow v.Bel(Max, v \wedge h) \wedge F(Claire, it))$ 

But this dynamic scope-taking is special to the predicate $F$, and in general, 
dynamic scope-taking as in (4) does not hold in this semantics. 

We can summarize this discussion as follows. 

Proposition 3. 

1. In $\mathfrak{S}_2$, ↓ doesn't take its scope dynamically w.r.t. $\wedge$. 

2. In $\mathfrak{S}_2$ with predicate $F$, for some variable $v$, $\downarrow v$ doesn't take its scope 
dynamically w.r.t. $\wedge$. 



4 Revision Systems of Circular Objects Based on 
Membership Description Systems 

We investigate here an alternative approach to circular objects which can grasp the 
dynamism of ↓ and $\wedge$. Our main idea is based on the fact that any set 
can be specified by a membership relation as well as by equational systems of sets, as 
follows: 

$\{x = \{x\}\} \Leftrightarrow (x \in x) \wedge \forall y\,(y \neq x \rightarrow y \notin x)$. 

Namely, we can construct the specification of a set based on the membership relation, 
called a membership description system (MDS), from any equational system of 
sets as follows. 



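Definition 7. 

1. An MDS $\mathcal{D}$ is a quadruple: 

$(X, A, \epsilon^{+}, \epsilon^{-})$, 

where 

- $\epsilon^{+}, \epsilon^{-} \subseteq pow(A \cup X) \times X$, and 
- $\epsilon^{+} \cap \epsilon^{-} = \emptyset$. 

2. A solution to an MDS $\mathcal{D}$ is a function $\theta$ with domain $X$ satisfying $\theta(x) \supseteq \{\theta(y) \mid y\,\epsilon^{+}\,x,\ y \in X\} \cup \{a \mid a\,\epsilon^{+}\,x,\ a \in A\} - \{\theta(y) \mid y\,\epsilon^{-}\,x,\ y \in X\}$, for each $x \in X$. 

We will also write an MDS as a set of propositions on membership, e.g., $\{x\,\epsilon^{+}\,x\}$ 
for $(\{x\}, \emptyset, \{(x, x)\}, \emptyset)$. 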

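Furthermore, we will define some concepts on MDSs as follows. 

Definition 8. 

1. An MDS $\mathcal{D} = (X, A, \epsilon^{+}, \epsilon^{-})$ is a subMDS of $\mathcal{D}' = (X', A, \epsilon'^{+}, \epsilon'^{-})$, written 
$\mathcal{D} \sqsubseteq \mathcal{D}'$, if $X \subseteq X'$, $\epsilon^{+} \subseteq \epsilon'^{+}$ and $\epsilon^{-} \subseteq \epsilon'^{-}$. $\mathcal{D}'$ is an expansion of $\mathcal{D}$. If 
$X = X'$ and $\mathcal{D} \sqsubseteq \mathcal{D}'$, $\mathcal{D}'$ is an extension of $\mathcal{D}$, written $\mathcal{D} \leq \mathcal{D}'$. 

2. An MDS $\mathcal{D} = (X, A, \epsilon^{+}, \epsilon^{-})$ is partial if $\epsilon^{+} \cup \epsilon^{-} \subset pow(A \cup X) \times X$. $\mathcal{D}$ is 
complete if $\epsilon^{+} \cup \epsilon^{-} = pow(A \cup X) \times X$. 

3. Let $\mathcal{E} = (X, A, e)$ be an equational system of sets. Then the MDS 
constructed from $\mathcal{E}$ is an MDS: 

$(X, A, \epsilon^{+}, \epsilon^{-})$, 

where 

- $x\,\epsilon^{+}\,y$ iff $x \in e(y)$, and 

- $x\,\epsilon^{-}\,y$ iff $x \notin e(y)$, 
for all $y \in X$. 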

Proposition 4. 

1. If an MDS $\mathcal{D}$ is complete, there is a unique solution to $\mathcal{D}$. 

2. An MDS $\mathcal{D}_{\mathcal{E}}$ constructed from an equational system of sets $\mathcal{E}$ is complete. 

3. For an MDS $\mathcal{D}$, there may be more than one solution to $\mathcal{D}$. 

Proof. (1) Let $\mathcal{D} = (X, A, \epsilon^{+}, \epsilon^{-})$ be a complete MDS. Let $e$ be a function 
constructed by the following conditions: 

- $x\,\epsilon^{+}\,y$ iff $x \in e(y)$, and 

- $x\,\epsilon^{-}\,y$ iff $x \notin e(y)$, 

for all $y \in X$. That is, $e(y) = \{x \mid x\,\epsilon^{+}\,y\}$. Then $(X, A, e)$ is obviously an 
equational system of sets. By AFA, there is a unique solution to it. 

(2) Directly from the definition of an MDS constructed from an equational system 
of sets and (1). 

(3) Let us consider an MDS $\{y\,\epsilon^{+}\,x\}$, and functions $\theta_0 = \{(x, \{\theta_0(y)\})\}$, 
$\theta_1 = \{(x, \{\theta_1(x), \theta_1(y)\}), (y, \{\theta_1(y)\})\}$, and $\theta_2 = \{(x, \{\theta_2(x), \theta_2(y)\}), (y, \{\theta_2(x), \theta_2(y)\})\}$. 
$\theta_0$, $\theta_1$ and $\theta_2$ are solutions to the MDS. □ 
Therefore, an MDS has many solutions, since it can be a subMDS of any of its 
expansions which are constructed from equational systems of sets. We call $\theta$ 
a minimal solution to an MDS $\mathcal{D} = (X, A, \epsilon^{+}, \epsilon^{-})$ if $\theta$ can be defined as the 
unique solution of the equational system of sets $\mathcal{E} = (X, A, e)$, where $e(y) = \{x \in X \cup A \mid x\,\epsilon^{+}\,y\}$ for each $y \in X$. We call $\theta$ a maximal solution to an MDS 
$\mathcal{D} = (X, A, \epsilon^{+}, \epsilon^{-})$ if $\theta$ can be defined as the unique solution of the equational 
system of sets $\mathcal{E} = (X, A, e)$, where $e(y) = \{x \in X \cup A \mid$ not $x\,\epsilon^{-}\,y\}$ for each 
$y \in X$. We choose a minimal solution to an MDS as its intended interpretation. 
If an MDS has no minimal solution, we call it undefined. Then we can consider 
a sequence of subMDSs $\mathcal{D}_0 \sqsubseteq \ldots \sqsubseteq \mathcal{D}_n$ as a revision process of circular objects. 

Example 4. The revision in (7) is specified as the simple operation of adding $\{b\,\epsilon^{+}\,x\}$ 
to $\{x\,\epsilon^{+}\,x,\ a\,\epsilon^{+}\,x\}$. 
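
Example 4 can be replayed concretely. The sketch below is an illustration under assumptions of our own: membership facts are stored as `(member, container)` pairs, `minimal_system` builds the equational system whose unique solution is the minimal solution, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MDS:
    """A membership description system (X, A, pos, neg).

    `pos` and `neg` hold (member, container) pairs for positive and
    negative membership facts; they must be disjoint.
    """
    X: frozenset
    A: frozenset
    pos: frozenset
    neg: frozenset = frozenset()

    def __post_init__(self):
        if self.pos & self.neg:
            raise ValueError("positive and negative facts must be disjoint")


def minimal_system(d):
    """Equational system e(y) = {x | x is positively a member of y}."""
    return {y: frozenset(m for (m, c) in d.pos if c == y) for y in d.X}


def expand(d, extra_pos=(), extra_neg=(), extra_X=()):
    """Return an expansion of `d` obtained by adding facts and indeterminates."""
    return MDS(
        X=d.X | frozenset(extra_X),
        A=d.A,
        pos=d.pos | frozenset(extra_pos),
        neg=d.neg | frozenset(extra_neg),
    )


def is_sub_mds(d, d2):
    """Check the subMDS relation between `d` and `d2`."""
    return d.X <= d2.X and d.pos <= d2.pos and d.neg <= d2.neg


# {x e+ x, a e+ x}, i.e. the circular set x = {x, a}
d0 = MDS(
    X=frozenset({"x"}),
    A=frozenset({"a", "b"}),
    pos=frozenset({("x", "x"), ("a", "x")}),
)
# Adding {b e+ x} revises it to x = {x, a, b}
d1 = expand(d0, extra_pos={("b", "x")})
```

The revision is a monotone addition of one membership fact, so `d0` is a subMDS of `d1`, and the minimal system changes from `x = {x, a}` to `x = {x, a, b}` without recomputing a substitution.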



4.1 $\mathfrak{S}_3$: A Semantics Based on MDSs 

Now we propose $\mathfrak{S}_3$, a semantics of $\mathcal{L}$ based on MDSs. In $\mathfrak{S}_3$, revisions of 
circular objects are considered as transitions of MDSs such that each of them 
is related to the next by the subMDS or expansion relation. For example, the revisions in (6) are 
defined by the following expanding operations: 

(10) a. $\{\langle Bel, Claire, s; 1\rangle\,\epsilon^{+}\,v,\ h'\,\epsilon^{+}\,s,\ v\,\epsilon^{+}\,s\} \cup \{\langle Bel, Max, s; 1\rangle\,\epsilon^{+}\,v\}$. 
b. $\{\langle Bel, Max, s; 1\rangle\,\epsilon^{+}\,v,\ h'\,\epsilon^{+}\,s,\ v\,\epsilon^{+}\,s\} \cup \{\langle Bel, Claire, s; 1\rangle\,\epsilon^{+}\,v\}$. 

Therefore, we can provide a natural semantics of $\mathcal{L}$ based on MDSs as follows. 

Definition 9. In $\mathfrak{S}_3$, each sentence $\varphi \in \mathcal{L}$ is assigned an update function 
$[\![\varphi]\!] : Context \times Root \times pow(MDS) \to Context \times Root \times pow(MDS)$, where 
Context is a set of sets of assignments, Root is an indeterminate (if $r$ is an 
indeterminate and $a \in AG$, then $a(r)$ is an indeterminate such that $r \neq a(r)$), 
and $[\![\varphi]\!](C, r, D)$ is defined by induction on the complexity of $\varphi$ as follows: 

- |has(a, c)](C, r, D) = {C, r, D C\ {{Has, a, c; l)e+r}), 

- lBel{a,p)]{C,r,D) = lip]{C,r, D (1 {{Bel, a,a{r);l)e+r} U [ip]{C,a{r)), 

- livpj{C,r,D) = l(pj{[-^y]{C),r,D), where [-^y]{C) = {h\g h,gGC}, 

- bi A P2I = bil o ^2!, 

- [has{a, c)](C, r) = {{Has, a, c; l)e+r}, 

- [Bel{a,ip)]{C,r) = {{Bel,a,a{r)-, l)e+r} U [lp\{C , a{r)) , 

- H(C,r) = {g{v)e+r\g G C), 

- [</^i A ‘P 2 ]{C, r) = [pi]{C, r) U [\‘P 2 }]{C, r). 



Proposition 5. $(\downarrow v.Bel(Claire, v \wedge h)) \wedge Bel(Max, v \wedge h) = (\downarrow v.Bel(Claire, v \wedge h) \wedge Bel(Max, v \wedge h))$ holds in $\mathfrak{S}_3$. 

Proof. Let $C$ be a set of assignments, $r$ be a root, and $D$ be a set of MDSs. 

$[\![(\downarrow v.Bel(Claire, v \wedge h)) \wedge Bel(Max, v \wedge h)]\!](C, r, D)$ 

$= [\![(\downarrow v.Bel(Claire, v \wedge h))]\!] \circ [\![Bel(Max, v \wedge h)]\!](C, r, D)$ 

$= [\![Bel(Max, v \wedge h)]\!]([\![\downarrow v.Bel(Claire, v \wedge h)]\!](C, r, D))$ 

$= [\![Bel(Max, v \wedge h)]\!]([\![Bel(Claire, v \wedge h)]\!]([\sim_v](C), r, D))$ 

$= [\![Bel(Claire, v \wedge h) \wedge Bel(Max, v \wedge h)]\!]([\sim_v](C), r, D)$ 

$= [\![(\downarrow v.Bel(Claire, v \wedge h) \wedge Bel(Max, v \wedge h))]\!](C, r, D)$ 

□ 

Corollary 2. In $\mathfrak{S}_3$, $\downarrow v$ takes its scope dynamically w.r.t. $\wedge$. 

5 Conclusion 

We have seen three types of dynamic semantics of a language of mutual beliefs 
as an example of the revision of circular objects. A semantics which is ∪-based 
w.r.t. SIT cannot treat the dynamic scope-taking of the ↓-operator w.r.t. $\wedge$. However, 
a semantics based on MDSs, which are proposed here as revision systems of circular 
objects, admits the dynamic scope-taking of the ↓-operator w.r.t. $\wedge$. 
Introduction 

First introduced by [ 121 , pomset linear logic can deal with linguistic aspects 
by inducing a partial order on words. [0| uses this property: it defines modules 
(or partial proof-nets) which consist in entries for words, describing both the 
category of the word and its behavior when interacting with other words. Then 
the natural question of comparing the generative power of such grammars with 
Tree Adjoining Grammars 0, as 0 pointed some links out, arises. 

To answer this question, we propose a logical formalization of TAGs in the 
framework of linear logic proof-nets. We aim to model trees and operations on 
these trees with a restricted part of proof-nets (included in the intuitionistic 
ones), and we show how this kind of proof-nets expresses equivalently TAG- 
trees. 

The first section presents all the definitions. Then, in the second section, we 
propose a fragment of proof-nets allowing the tree encoding, and the third section 
defines the way we model operations on proof-nets. Mirroring the second 
section, the fourth one allows us to come back from proof-nets to trees. Finally, 
the fifth section shows examples of how the definitions and properties work. 



1 Definitions 
1.1 TAG 

First, extending the original definition of TAG 0 with the substitution operation 
as in rmo . we get: 

Definition 1. A TAG is a 5-tuple $(V_N, V_T, S, I, A)$ where: 

1. $V_N$ is a finite set of non-terminal symbols, 

2. $V_T$ is a finite set of terminal symbols, 

3. $S$ is a distinguished non-terminal symbol, the start symbol, 

4. $I$ is a set of initial trees, 

5. $A$ is a set of auxiliary trees. 

M. Moortgat (Ed.): LACL’98, LNAI 2014, pp. 2.in- I^CTI 2001. 
Initial trees represent basic sentential structures or basic categories. They have
non-terminal leaves at which substitution takes place, and they serve as arguments
to other initial trees or to auxiliary trees. An auxiliary tree is characterized by
a leaf (marked with a *) bearing the same label as the root node. An elementary
tree is either an initial tree or an auxiliary tree.

The TAGs we are considering here will always be such that every elementary
tree has at least one leaf labeled by a terminal symbol, so that the TAGs are
lexicalized.

Second, to refer to trees and to the nodes in these trees, we use the notation
of [?], defined for trees over the finite alphabet $V$ ($V = V_N \cup V_T$).

Definition 2. $\gamma$ is a tree over $V$ iff it is a function from $D_\gamma$ into $V$, where the
domain $D_\gamma$ is a finite subset of $J^*$ such that:

1. if $q \in D_\gamma$ and $p < q$, then $p \in D_\gamma$;

2. if $p \cdot j \in D_\gamma$ and $j \in J$, then $p \cdot 1, p \cdot 2, \ldots, p \cdot (j-1) \in D_\gamma$;

where $J^*$ is the free monoid generated by $J$, the set of all natural numbers, $\cdot$ is
the binary operation, $0$ is the identity and, for $q \in J^*$, $p \leq q$ iff there is an
$r \in J^*$ such that $q = p \cdot r$, and $p < q$ iff $p \leq q$ and $p \neq q$.

We call the elements of $D_\gamma$ addresses of $\gamma$. If $(p, X) \in \gamma$, then we say that $X$ is
the label of the node at the address $p$ in $\gamma$. We write $\gamma(p) = X$.
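As an illustration of Definition 2, here is a minimal Python sketch. The encoding is an assumption of this sketch, not part of the formalism: addresses are tuples of positive integers, the identity $0$ (the root address) is the empty tuple, and a tree is a dictionary from addresses to labels. The function name `is_tree_domain` is hypothetical.

```python
from typing import Dict, Set, Tuple

Address = Tuple[int, ...]   # assumption: the root address 0 is the empty tuple ()
Tree = Dict[Address, str]   # gamma as a finite map from D_gamma into V


def is_tree_domain(domain: Set[Address]) -> bool:
    """Check conditions 1 and 2 of Definition 2 on a finite set of addresses."""
    for q in domain:
        if any(j < 1 for j in q):
            return False
        if not q:
            continue  # the root has neither a mother nor left sisters
        mother, j = q[:-1], q[-1]
        # Condition 1: checking the mother of every address is enough,
        # since the mother is itself checked when the loop reaches it.
        if mother not in domain:
            return False
        # Condition 2: all left sisters mother.1 ... mother.(j-1) are present.
        if any(mother + (i,) not in domain for i in range(1, j)):
            return False
    return True


# A small tree P(N, V(dort)), chosen only for illustration.
example: Tree = {(): "P", (1,): "N", (2,): "V", (2, 1): "dort"}
print(is_tree_domain(set(example)))
```

With this encoding, $\gamma(p) = X$ is simply `example[p] == X`.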

Third, we require another property: 

Property 1 (w). A tree $\gamma$ satisfies the w property iff, for every $p \in D_\gamma$ such
that $\gamma(p) \in V_T$, we have $p = q \cdot 1$ and $q \cdot 2 \notin D_\gamma$.

It means that for a tree, if a node is terminal, labeled by a terminal symbol, 
then it is the unique daughter of its mother-node. Performing the two operations 
(substitution and adjunction) preserves this property. 
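The w property can be checked mechanically. The following Python sketch assumes the same hypothetical encoding as before (addresses as tuples of positive integers, root as the empty tuple, a tree as a dictionary); `has_w_property` is an illustrative name, not one taken from the paper.

```python
from typing import Dict, Set, Tuple

Address = Tuple[int, ...]
Tree = Dict[Address, str]


def has_w_property(tree: Tree, terminals: Set[str]) -> bool:
    """True iff every terminal-labeled node is the unique daughter of its mother."""
    for p, label in tree.items():
        if label not in terminals:
            continue
        # p must be of the form q.1 ...
        if not p or p[-1] != 1:
            return False
        # ... and q.2 must not be an address of the tree.
        if p[:-1] + (2,) in tree:
            return False
    return True


terminals = {"Jean", "dort"}
good: Tree = {(): "P", (1,): "N", (1, 1): "Jean", (2,): "V", (2, 1): "dort"}
bad: Tree = {(): "P", (1,): "N", (2,): "dort"}  # 'dort' has a sister
print(has_w_property(good, terminals), has_w_property(bad, terminals))
```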

However, considering only TAGs whose elementary trees have the w property does
not restrict the generated language. Indeed, let $G$ be a TAG whose elementary
trees do not have the w property, $T(G)$ the set of all the trees that the two
operations produce in $G$, $L(G)$ the language that $G$ generates (the set of strings
read as sequences of the terminal-labeled leaves of the trees in $T(G)$), and $G_1$
the TAG obtained from $G$ by modifying its elementary trees so that they have the
w property. Then there is no special relation between $T(G)$ and $T(G_1)$.
Nevertheless, we have $L(G) \subset L(G_1)$.

Fourth, another restriction, similar to the restriction from [?], is to avoid the
use of trees $\gamma$ such that:

$$\exists p \in D_\gamma,\quad \gamma(p) = \gamma(p \cdot 1) \text{ and } p \cdot 2 \notin D_\gamma$$

It means that no tree has an $X$-labeled node whose unique daughter is also
an $X$-labeled node.
Table 1. Definitions of the links

Name         Premises               Conclusions
axiom        none                   $A$ and $A^\perp$
$\wp$        $A$ and $B$            $A \wp B$
$\otimes$    $A$ and $B$            $A \otimes B$
Cut          $A$ and $A^\perp$      none
$<$          $A$ and $B$            $A < B$

(The RB-graph column, which draws each link, is not reproduced.)



1.2 Lexicalized Proof-Nets 

Proof-nets in linear logic have become familiar [?]. In this paper, we use
the notation of [?] for proof-nets, extended to the ordered calculus [?]. It defines
proof-nets as bicolored (Red and Blue, or Regular and Bold) graphs with the
five links corresponding to the axiom, the tensor ($\otimes$), the before ($<$), the par ($\wp$)
and the cut (Cut). This calculus enjoys cut-elimination [?], a crucial property
for our modeling.

Let us recall the main definitions:

Definition 3 (RB-graphs). An RB-graph is a graph with coloured edges (blue
and red, or bold and regular). B-edges are undirected. The R-edges may be
undirected or directed, in which case we call them R-arcs.



Definition 4 (Links). There are five sorts of links, defined as RB-graphs (see
table 1).



Definition 5 (Proof-structure). A proof structure is a RB-graph such that 
any B-edge is the conclusion of exactly one link and the premise of at most one 
link (the B-edges which are not a premise of any link are called conclusions of the 
proof-structure, they contain all the cuts), provided with a set of R-arcs between 
conclusions which defines a strict partial order. 



Definition 6 (Proof-net). An ordered proof-net is a proof-structure which
contains no alternate elementary circuit (1).

We use the terms correctness criterion and correctness checking for the absence
of any alternate elementary circuit in a proof-structure, which tells us
whether a proof-structure is a proof-net or not.

(1) An alternate elementary circuit is a path of even length, starting and ending on
the same vertex, using every other vertex at most once, and with alternating blue
and red edges.
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Table 2. Rewriting rules on proof-nets for cut-elimination 




Proposition 1 (Cut-elimination). Cuts can be eliminated. More precisely:
let $\Pi$ be a proof-net whose conclusions are $F_1, F_2, \ldots, F_k, \mathrm{Cut}_1, \ldots, \mathrm{Cut}_p$, ordered by
$\mathfrak{R}$ (where $F_1, \ldots, F_k$ are formulas and all the $\mathrm{Cut}_i$ are cuts). It is possible to rewrite
$\Pi$ as $\Pi'$ with conclusions $F_1, \ldots, F_k$ ordered by the restriction of $\mathfrak{R}$ to these
formulas. Moreover, this rewriting enjoys strong normalisation and confluence [?].

Table 2 shows the rewriting rules on proof-nets.

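We do not consider all proof-nets, but only those taking their formulas in
the $C$ language defined as follows: $\mathcal{A}$ is an alphabet of atomic formulas (we shall
take $\mathcal{A} = V_N$), $A$ ranges over $\mathcal{A}$, and

$$B_1 ::= A^\perp \mid A^\perp \wp B_1 \qquad B_2 ::= A \mid B_2 < A$$
$$C ::= A \mid A^\perp \mid A \otimes A^\perp \mid A \wp A^\perp \mid B_1 \wp (B_2 \otimes A^\perp)$$

Moreover, we always set the partial order relation $\mathfrak{R}$ on conclusions to $\emptyset$.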

In addition to logical formulas, we also decorate proof-nets with labels from a
finite set of terminal symbols. Then, as a restriction of the lexicalized
intuitionistic labeled proof-nets defined in [?], we define:

Definition 7.
1. Output: An output is either a B-edge that is labeled by a positive atom or
   the conclusion of a par-link between two atoms dual to one another (we call
   such a conclusion a par-gate).

2. Intuitionistic proof-net: An intuitionistic proof-net (IPN) is a proof-net
   which contains one and only one output.

3. Simply lexicalized proof-net: A simply lexicalized IPN (SLIPN) is a
   lexicalized IPN the conclusions of which are of the form:

   a) atoms or duals of atoms,

   b) output,

   c) $A_{i_1}^\perp \wp \cdots \wp A_{i_p}^\perp \wp ((A_{j_1} < \cdots < A_{j_m}) \otimes Y^\perp)$ with $j_i \in [1, m]$ ($j_i < j_k$ iff $i < k$)
      and every $A_j$ is labeled by a string $w_j$ ($A_i \in V_N$ and $w_j \in V_T$),

   d) $X \otimes X^\perp$ with $X$ atomic (we call such a conclusion a tensor-gate).
Note that neither the lexicalization nor the intuitionistic property contributes
to the correctness checking of proof-structures. The correctness criterion
does not change (since it is (almost) the only one that handles the before-link, we
would rather keep it). On the other hand, the intuitionistic feature allows the
use of (a variant of) intuitionistic paths [?]. This feature is stable under the
operations we are considering, and paths enable the decoding from proof-nets to
trees (see section 4).

2 From Trees to Proof-Nets

This section defines, for each elementary tree of a TAG, a corresponding SLIPN,
by induction on the height of the trees. The set of atoms for logical formulas
comes from $V_N$, and labels come from $V_T$.



Table 3. Initial trees mapping
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2.1 Initial Trees 



We first give the general idea of this encoding on the tree $T_2$ of table ??. Forgetting
the lexicalized part, we can read this tree as a terminal node ($N$) preceding
another terminal node ($V$) to produce their mother-node ($P$). We can express
this idea with a formula of pomset logic: $(N < V) \multimap P$. But do not forget that
this is a brick from which we want to derive $S$. Moreover, proof-nets correspond
to one-sided sequents, so that we are actually more interested in the dual of such
a formula, namely $(N < V) \otimes P^\perp$. Thus, we shall have a SLIPN with the latter as a
sub-formula, the other connectives dealing with the lexicalization.
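The passage from the implication to its dual can be spelled out as follows. This is only a sketch, assuming the usual definition of linear implication, $A \multimap B := A^\perp \wp B$, the De Morgan duality between $\wp$ and $\otimes$, and the involutivity of negation:

```latex
\begin{align*}
(N < V) \multimap P
  &= (N < V)^\perp \wp P \\
\bigl((N < V) \multimap P\bigr)^\perp
  &= (N < V)^{\perp\perp} \otimes P^\perp \\
  &= (N < V) \otimes P^\perp
\end{align*}
```

Note that the dual of the before-formula $(N < V)$ never needs to be expanded, since the two negations cancel.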

Table 3 sums up the translation. Note that for $h = 1$, the last two cases
do not belong to the TAG under consideration. Nevertheless, we need the
definition of their corresponding SLIPNs for the next steps of the induction. The
case $h = 2$ only covers the situation where the lexicalized subtrees (at least one
exists) have height 1 and the other subtrees have height 0, since the next, general
case deals with the situations where the other subtrees can have height 1. In the
general case, we possibly have $\{i_1, \ldots, i_p\} = \emptyset$, and in figure 1(b) the
$\Pi_{j_1}, \ldots, \Pi_{j_m}$ are the inductively built SLIPNs corresponding to the subtrees
of $\gamma$ at $X_{j_1}, \ldots, X_{j_m}$.




(a) SLIPNs corresponding to 
trees of height 2 



(b) SLIPNs corresponding to trees 
in the general cases 



Fig. 1. SLIPNs for higher trees 
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Fig. 2. Auxiliary trees 



Definition 8. For every initial tree $T$, we call corresponding initial SLIPN, or
$T$ transformation, the SLIPN defined as in table 3 and figure 1.



Remark 1.
1. The unique output of every SLIPN corresponds to the root node of the tree,
   and every terminal node of a tree that is labeled by a non-terminal symbol
   corresponds to a conclusion of the corresponding SLIPN.

2. There is a one-to-one mapping between the non-terminal symbol labeled nodes
   of a tree and the axiom links of the corresponding SLIPN.



2.2 Auxiliary Trees 

Let $\gamma$ be an auxiliary tree and let us define $\gamma^+$ as the same tree as $\gamma$ except
for its $X^*$ node, replaced with an $X$ node. We call $r$ the address of $X^*$ in $\gamma$, so
that $\gamma(r) = X^*$ and $\gamma^+(r) = X$. Then, following the definition in the previous
section, we can define $\Pi^+$, the SLIPN corresponding to $\gamma^+$. And, as $\gamma$ is an
auxiliary tree, $\gamma^+(r) = \gamma^+(0)$ and $\Pi^+$ has a conclusion $X$ (corresponding to
$\gamma^+(0)$) and a conclusion $X^\perp$ (corresponding to $\gamma^+(r)$).

Thus we define $\Pi$, the SLIPN corresponding to $\gamma$, as the proof-net built from
$\Pi^+$ by binding its $X$ and $X^\perp$ conclusions with a $\wp$-link (see figure 2). $\Pi$ is a
(correct) SLIPN.

Definition 9. For every auxiliary tree $\gamma$, we call auxiliary corresponding
SLIPN, or $\gamma$ transformation, the SLIPN defined as above. Then, for every
elementary tree, we call corresponding SLIPN the initial or auxiliary SLIPN
corresponding to that tree.



3 Elementary Operations 

This section deals with a particular case of the next section but focuses on the 
core operations which we can refer to during the generalisation. 



Lexicalized Proof-Nets and TAGs 







Fig. 3. Adding a tensor-gate 



3.1 The Substitution Operation 

Let $\gamma_1$ and $\gamma_2$ be two trees such that we can substitute $\gamma_2$ for a terminal node
$X$ (whose address is $r$) of $\gamma_1$, and let $\Pi_1$ and $\Pi_2$ be their corresponding SLIPNs.
Then $\gamma_1(r) = \gamma_2(0)$, and (cf. remark 1) $\Pi_1$ has a conclusion $X^\perp$ and the output
of $\Pi_2$ is $X$.

Thus we can bind these conclusions with a cut-link, which yields a new SLIPN
from which we eliminate the cut to obtain a new SLIPN $\Pi$.

Definition 10. For every tree $\gamma$ resulting from the substitution of $\gamma_2$ for a node
of $\gamma_1$, we define its corresponding SLIPN $\Pi$ as above.



3.2 The Adjunction Operation 

Preparing the Target Tree. In order to allow the adjunction of an auxiliary
tree on a target tree, we need to modify the latter slightly.

Let $\gamma$ be the tree on which we want to perform an adjunction, $r$ the address
of the node where the adjunction takes place, and $\Pi$ the corresponding SLIPN.
As noted in remark 1, $\Pi$ contains an axiom link between $\gamma(r)$ and $\gamma(r)^\perp$
corresponding to the node $\gamma(r)$. So we can split this link into two axiom-links
linked with a tensor-link, and we obtain (with $X = \gamma(r)$) a new SLIPN $\Pi'$ as
shown in figure 3.

Proposition 2. Adding a tensor-gate preserves the correctness of the proof-net.

Actually, such a conclusion is an instance of a cut (as a cut is equivalent
to a conclusion $(\exists A)(A \otimes A^\perp)$). Then, if this tensor-gate remains unused,
cut-elimination amounts to deleting this tensor-gate and coming back to a simple
axiom-link. The second example of section 5 uses this feature at the very end of
the derivation.



Performing the Operation. In this section, we define the SLIPN corresponding
to the result of adjoining the tree $\gamma_2$ to $\gamma_1$. In this preliminary case, both $\gamma_1$
and $\gamma_2$ are elementary trees ($\gamma_2$ is an auxiliary tree and $\gamma_1$ is the target tree),
and $\Pi_1$ and $\Pi_2$ correspond to them.

We assume $\Pi_1$ already has a tensor-gate added on the axiom corresponding
to the node where we want to adjoin $\gamma_2$. Hence, if $X$ labels the starred node of
$\gamma_2$, then $X$ also labels the node of $\gamma_1$ receiving the adjunction, and we have a
tensor-gate conclusion $X \otimes X^\perp$ for $\Pi_1$ and a par-gate $X \wp X^\perp$ (the output) for
$\Pi_2$ (merely by construction).

Thus, we can bind $\Pi_1$ and $\Pi_2$ with a cut-link and eliminate this cut to obtain
a new SLIPN (because there is no modification of the lexicalized parts, still no
alternate elementary circuit and still only one output, namely that of $\gamma_1$).

Definition 11. For every tree $\gamma$ resulting from the adjunction of the auxiliary
tree $\gamma_2$ on $\gamma_1$, we call the corresponding SLIPN the SLIPN built as above.



3.3 Operations on Derived Trees 

Up to now, we have defined a way of modeling elementary trees in the framework
of SLIPNs, and a way of combining these SLIPNs to model the adjunction and
substitution operations on elementary trees. We now extend this modeling to
derived trees, so that a SLIPN will correspond to every tree (elementary or
derived) of a TAG.

Definition 12. For a tree $\gamma$ and an elementary tree $E_0$, we call derivation of $\gamma$
the pair $\langle E_0, ((o_1, E_1, \gamma_1), \ldots, (o_n, E_n, \gamma_n)) \rangle$ such that $\gamma_n = \gamma$ and, for every
$i \in [1, n]$, $\gamma_i$ results from the operation $o_i$ (adjunction or substitution) between
$\gamma_{i-1}$ and the elementary tree $E_i$ (with $\gamma_0 = E_0$).
$n$ is the length of the derivation.



Remark 2. For a derived tree, the derivation is not necessarily unique. 



Definition 13. Let $\gamma$ be a derived tree from the derivation $d$. Then we can define
the d-SLIPN $\Pi^{(d)}$ corresponding to $\gamma$, built only with cut and cut-elimination
(between unlabeled conclusions) from the SLIPNs corresponding to the elementary
trees of the derivation.

Actually, proving the existence of this SLIPN interests us more than the
simple definition, as the proof also gives the steps of its construction.

Proof. We prove the existence of $\Pi^{(d)}$ by induction on the length $l$ of $d$. We also
prove the property that if $\gamma$ has a terminal node (except for the starred node of an
auxiliary tree) labeled by a non-terminal symbol $X$, then $\Pi^{(d)}$ has a pendant
conclusion $X^\perp$ (corresponding to this node, hence not labeled either).

1. If $l = 0$: $\gamma$ is an elementary tree, and we already defined its transformation.
   Its construction also proves the property of the pendant conclusion.

2. If $l > 0$: let $d_{l-1} = \langle \gamma_0, ((o_1, E_1, \gamma_1), \ldots, (o_{l-1}, E_{l-1}, \gamma_{l-1})) \rangle$ and let
   $\Pi^{(d_{l-1})}$ be the SLIPN corresponding to $\gamma_{l-1}$ in the d-derivation.

   a) If $o_l$ is the substitution of a leaf $X$ of $\gamma_{l-1}$ by $E_l$ (whose transformation
      is $\pi_l$), then $\Pi^{(d_{l-1})}$ has an axiom-link between $X$ and $X^\perp$ in which $X^\perp$ is a
      pendant conclusion (induction hypothesis), and $\pi_l$ has an axiom-link
      between $X$ and $X^\perp$ in which $X$ is a pendant conclusion ($E_l(0) = X$). Then
      we can link these two pendant conclusions with a cut-link, and eliminate
      it. This yields a new SLIPN. It also proves the property of the pendant
      conclusion, as every terminal node of the new tree labeled with a
      non-terminal symbol was already terminal in one of the two trees, so that
      (by induction hypothesis) it already had the property.

   b) If $o_l$ is the adjunction on the node $X$ of $\gamma_{l-1}$ of the auxiliary tree $E_l$
      (whose transformation is $\pi_l$), then the SLIPN corresponding to $\gamma_{l-1}$ has
      an axiom-link between $X$ and $X^\perp$ that we can replace (while remaining in
      the class of SLIPNs) with two axiom-links linked together with a
      tensor-link (i.e. we add a tensor-gate $X \otimes X^\perp$, as for adjunctions on
      elementary trees). $\pi_l$ has a par-gate $X \wp X^\perp$, so that we can bind the two
      gates with a cut-link, and then eliminate the latter. We obtain a new
      SLIPN and, as above, the induction proves the property of the pendant
      conclusion.

□ 

Remark 3. This shows that cuts only occur between atomic formulas or between a
tensor-gate and a par-gate. Actually, the grammar given for the conclusions of
SLIPNs indicates that no other cut can occur.

Throughout this section, we assumed that adjunction is allowed at any time on
any node. Of course, we sometimes do not want such freedom. Allowing the addition
of a tensor-gate only in the lexicon, and not during the derivation, provides a
way to restrict it.

Section 5 shows examples of both cases. In particular, the mild
context-sensitivity of TAGs, illustrated by the generation of $\{a^n b^n c^n d^n\}$,
corresponds to the second case.

4 From SLIPNs to Trees

So far, we explained how, given a TAG and a derived tree in this TAG, we could 
obtain a SLIPN that we qualify as corresponding. But we now have to see how 
this SLIPN actually corresponds so that we shall henceforth be able to handle 
only proof-nets and translate the results on trees. 

4.1 Polarities 

Let us define a positive polarity ($\circ$) and a negative one ($\bullet$). Every formula is
inductively polarized as follows: if $a$ is an atom then $a^\circ$ and $(a^\perp)^\bullet$. Then, for
each link we define the polarity of the conclusion from those of the premises, as
in table 4.

We call input the negative conclusions, and output the positive one (which is 
coherent with the previous definition of the output) . 

The grammar on SLIPNs’ conclusions shows that SLIPNs are polarized. 
Table 4. Polarities of the conclusions

4.2 Reading of SLIPNs 

We give an algorithm for the reading of any cut-free SLIPN, based on the
polarities of formulas. It uses a very simple principle: starting from the
output and following the positive polarities, we define a path across the
proof-net. Every time the path crosses an axiom-link, we add a node to the tree
under construction. Actually, we build both the function $\gamma$ and $D_\gamma$. Of course,
the path cannot cross the same axiom-link twice (such a possibility would occur
only with a par-gate).

As we shall see later, the path never crosses a par-link (except par-gates, from
the positive conclusion), always enters a tensor-link through a negative premise
and always enters a before-link through the positive conclusion. So, because of
the before-link, the path is not linear (both premises, which are positive, are
candidates to be the next on the path), and we define the first branch as the path
from the positive premise of the before-link at the beginning of the arrow, and
the second branch as the path from the other premise.

Then, when adding a new node to the tree, its mother-node is the last
node met on the same branch of the path (in a before-link, both premises are on
the same branch as the conclusion, but they are themselves on different branches;
other connectives do not create branches). So if the address of its mother-node
is $p$, then the address of the new node is $p \cdot j$ with $j \in \mathbb{N}^*$ and, for all $i$ such
that $0 < i < j$, $p \cdot i \in D_\gamma$.

Then, we state the algorithm as follows:

1. Enter the net through the only output (so, if $X$ is the output, $0 \in D_\gamma$ and
   $(0, X) \in \gamma$).

2. Follow the path defined by the positive polarities until reaching an atom
   (when a before-link is crossed, first choose the premise at the beginning of
   the red arc) and cross its axiom-link. If the conclusions of this link are $X$
   and $X^\perp$, we define its address $p$ as specified above (with respect to the
   branches), and $p \in D_\gamma$ and $(p, X) \in \gamma$.

3. a) If the input is lexicalized, then lexicalize the last written node of the
      tree under construction (if the lexicalization is $x$, we then add
      $p \cdot 1 \in D_\gamma$ and $(p \cdot 1, x) \in \gamma$). Either there is no further link, or this
      input is a premise of a par-link whose other polarities are negative. In
      both cases, come back to the last before-link of which the path has not
      yet gone through both branches, and proceed as in step 2.
   b) Otherwise, just continue as in step 2.

4. Stop when the path has reached all the atoms.

The next section will show that every SLIPN built from elementary SLIPNs
can be read in this way and that the path crosses every axiom-link.

Remark 4.
1. This reading provides a unique correspondence between any axiom link of the
   SLIPN and a node (labeled by a non-terminal symbol). Moreover, if the
   negative conclusion of an axiom-link is pendant and not lexicalized, then it
   corresponds to a terminal node of the tree.

2. Two different SLIPNs can have the same reading. This underlines the
   importance of fixing a base of elementary SLIPNs (corresponding to the
   elementary trees of a given TAG).



4.3 From SLIPNs, Back to Trees

We now have both a mapping from trees to SLIPNs and a mapping from SLIPNs to
trees. It remains to check that the composition of these mappings gives the
identity. This consists of three steps:

1. check that the reading of a SLIPN corresponding to an elementary tree is 
the same as the elementary tree; 

2. check that the reading of a SLIPN corresponding to the substitution between 
two trees is the resulting tree; 

3. check that the reading of a SLIPN corresponding to the adjunction between 
two trees is the resulting tree. 

Moreover, given a TAG, the basic bricks we consider are SLIPNs corresponding 
to elementary trees of this TAG. Then the only way to build new SLIPNs is 
binding them with cut between unlabeled conclusions. 

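Let us recall the definitions of subtrees and supertrees as in [?].

Definition 14. Let $\gamma$ be a tree and $p \in D_\gamma$. Then

$$\gamma / p = \{(q, X) \mid (p \cdot q, X) \in \gamma,\ q \in J^*\}$$
$$\gamma \backslash p = \{(q, X) \mid (q, X) \in \gamma,\ p \not< q\}$$

$\gamma / p$ is called the subtree of $\gamma$ at $p$ and $\gamma \backslash p$ is called the supertree of $\gamma$ at $p$.
Further, for $p \in J^*$,

$$p \cdot \gamma = \{(p \cdot q, X) \mid (q, X) \in \gamma\}$$



Property 2. $\gamma = \gamma \backslash p \cup p \cdot (\gamma / p)$ for every tree $\gamma$ and $p \in D_\gamma$.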
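Definition 14 and Property 2 can be illustrated by a short Python sketch. It assumes a hypothetical encoding in which addresses are tuples of positive integers (the root being the empty tuple) and a tree is a dictionary from addresses to labels; the function names are illustrative.

```python
from typing import Dict, Tuple

Address = Tuple[int, ...]
Tree = Dict[Address, str]


def is_strict_prefix(p: Address, q: Address) -> bool:
    """p < q in the sense of Definition 2."""
    return len(p) < len(q) and q[:len(p)] == p


def subtree(tree: Tree, p: Address) -> Tree:
    """gamma / p: the nodes at or below p, re-addressed relative to p."""
    return {q[len(p):]: x for q, x in tree.items() if q[:len(p)] == p}


def supertree(tree: Tree, p: Address) -> Tree:
    """gamma \\ p: every node that is not strictly below p (p itself is kept)."""
    return {q: x for q, x in tree.items() if not is_strict_prefix(p, q)}


def shift(p: Address, tree: Tree) -> Tree:
    """p . gamma: prefix every address of the tree with p."""
    return {p + q: x for q, x in tree.items()}


# Illustrative tree P(N(Jean), V(dort)).
gamma: Tree = {(): "P", (1,): "N", (1, 1): "Jean", (2,): "V", (2, 1): "dort"}
p = (2,)
rebuilt = {**supertree(gamma, p), **shift(p, subtree(gamma, p))}
print(rebuilt == gamma)  # Property 2 on this example
```

The node at address $p$ belongs both to the supertree and to the shifted subtree, with the same label, so the union is well defined.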



Remark 5. If the reading of a SLIPN $\Pi$ gives $\gamma$, and if we make the path begin at
any positive conclusion of an axiom link that corresponds to the node at address
$p$ in $\gamma$, then the reading algorithm returns $\gamma / p$. And of course, the reading of $\Pi$
with a pruning at the same axiom link returns $\gamma \backslash p$.
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Fig. 4. Substitution 



Elementary Reading. Let $\gamma$ be an elementary tree. To prove that the reading of
the SLIPN corresponding to $\gamma$ is $\gamma$ itself, we simply proceed by induction,
with the same steps as for the building of elementary SLIPNs.



Substitution. Let us consider two SLIPNs $\Pi_1$ and $\Pi_2$, whose readings are
respectively $\gamma_1$ and $\gamma_2$. We assume $\Pi_1$ was built (with cuts) from elementary
SLIPNs and $\Pi_2$ is an elementary SLIPN itself.

Proposition 3. If a negative (not labeled) atomic conclusion of $\Pi_1$ and a
positive atomic conclusion of $\Pi_2$ support a cut-linking, then the corresponding
terminal node of $\gamma_1$ accepts a substitution by the root of $\gamma_2$. And the reading of
the new SLIPN, after cut-elimination, corresponds exactly to the resulting tree.

Proof. Let $p$ be the address of $X$ in the reading $\gamma_1$ of $\Pi_1$, where $X$ corresponds to
the axiom link of figure 4 (on the left). The address of $X$ in the reading $\gamma_2$ of $\Pi_2$
is $0$ (the root node), and the reading of $\gamma_2$ starts at this $X$ axiom-link. On the
other hand, the reading of $\gamma_1$ stops at $X$ for its branch. After the cut and the
cut-elimination, the reading of the new SLIPN (on the right of figure 4) starts
at the output, which was also the output of $\Pi_1$, and continues as for $\gamma_1$ until
it reaches the new $X$ axiom-link, whose address in the new tree $\gamma$ is also $p$. There,
the reading of $\gamma_2$ takes place, so that, as defined in the algorithm, if $\gamma$ is the
reading of the new SLIPN,

$$\forall q \in D_{\gamma_2},\quad \gamma(p \cdot q) = \gamma_2(q)$$

and nothing changes for the remaining reading: it is the same as for $\gamma_1$ (because
$\gamma_1 = \gamma \backslash p$). Then the reading $\gamma$ of $\Pi$ is such that

$$\gamma = \gamma_1 \cup p \cdot \gamma_2$$

which corresponds to the definition of the substitution of the $\gamma_1(p)$ node with
$\gamma_2$. □
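On the tree side, the equation $\gamma = \gamma_1 \cup p \cdot \gamma_2$ can be sketched in Python as follows. The encoding (addresses as tuples, trees as dictionaries) and the two sample trees are assumptions made for this illustration; they are meant to resemble the *Jean dort* example, not to reproduce the lexicon table.

```python
from typing import Dict, List, Set, Tuple

Address = Tuple[int, ...]
Tree = Dict[Address, str]


def substitute(gamma1: Tree, p: Address, gamma2: Tree) -> Tree:
    """Return gamma1 U p.gamma2, the substitution of gamma2 at the leaf p of gamma1."""
    if p not in gamma1:
        raise ValueError("p is not an address of gamma1")
    if any(len(q) > len(p) and q[:len(p)] == p for q in gamma1):
        raise ValueError("the node at p is not a leaf")
    if gamma1[p] != gamma2[()]:
        raise ValueError("the leaf and the root of gamma2 carry different labels")
    result = dict(gamma1)
    result.update({p + q: x for q, x in gamma2.items()})
    return result


def word_yield(tree: Tree, terminals: Set[str]) -> List[str]:
    """Terminal labels from left to right (lexicographic order on addresses)."""
    return [tree[q] for q in sorted(tree) if tree[q] in terminals]


# Hypothetical elementary trees: N(Jean) and P(N, V(dort)).
t1: Tree = {(): "N", (1,): "Jean"}
t2: Tree = {(): "P", (1,): "N", (2,): "V", (2, 1): "dort"}

jean_dort = substitute(t2, (1,), t1)
print(word_yield(jean_dort, {"Jean", "dort"}))
```

The shared node at address $p$ receives the same label from both trees, which is why a plain dictionary update implements the set union here.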



Adjunction. As above, let us consider two SLIPNs $\Pi_1$ and $\Pi_2$, whose readings
are respectively $\gamma_1$ and $\gamma_2$. We assume $\Pi_1$ was built (with cuts) from elementary
SLIPNs and $\Pi_2$ is an elementary SLIPN itself.

Proposition 4. If a tensor-gate of $\Pi_1$ and the unique positive conclusion of
$\Pi_2$ support a cut-linking, then the node corresponding to the tensor-gate on
$\gamma_1$ accepts an adjunction of $\gamma_2$. And the reading of the new SLIPN, after
cut-elimination, corresponds exactly to the resulting tree.

Proof. In the following, when dealing with $\gamma_1$, we speak about the reading of $\Pi_1$
without the tensor-gate.

Let us consider $\Pi$ in figure 5 (the SLIPN on the right). The input of $\Pi$ is the
same as that of $\Pi_1$. As the positive conclusion of an axiom-link always occurs
before the negative conclusion in the path, the reading of the new tree $\gamma$ from $\Pi$,
after reaching $X$ in $\Pi_1$, crosses the axiom-link to $\Pi_2$, so that $\gamma / p \neq \gamma_1 / p$ but
$\gamma \backslash p = \gamma_1 \backslash p$ (see remark 5). Yet the $X^\perp$ of $\Pi_2$ is the other conclusion of the
starting axiom-link for $\gamma_2$, so that at the address $p$, for $\gamma$, we read $\gamma_2$. Then, for
every $q \in D_{\gamma_2}$, $\gamma(p \cdot q) = \gamma_2(q)$.

Moreover, on reaching the $X$ of $\Pi_2$, the path does not stop but continues with
the remaining part of $\gamma_1$, namely $\gamma_1 / p$. So at the new address of the $X$ of $\Pi_2$ in
$\gamma$ we add $\gamma_1 / p$. And the new address of $X$ in $\gamma$ is $p \cdot r$ (with $r$ the address of $X^*$ in
$\gamma_2$).

Then

$$\gamma = \gamma_1 \backslash p \cup p \cdot \gamma_2 \cup p \cdot r \cdot (\gamma_1 / p)$$

which is the definition of the adjunction of $\gamma_2$ on $\gamma_1$ at $X$. □
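The adjunction equation $\gamma = \gamma_1 \backslash p \cup p \cdot \gamma_2 \cup p \cdot r \cdot (\gamma_1 / p)$ can likewise be sketched on trees in Python. Everything below is illustrative: trees are dictionaries from address tuples to labels, the foot node of the auxiliary tree is stored with the plain label $X$ and identified by its address $r$ (an encoding choice of this sketch, which avoids a label clash at $p \cdot r$), and the sample trees are hypothetical stand-ins for *Jean dort* and *beaucoup*.

```python
from typing import Dict, List, Set, Tuple

Address = Tuple[int, ...]
Tree = Dict[Address, str]


def adjoin(gamma1: Tree, p: Address, gamma2: Tree, r: Address) -> Tree:
    """Adjoin the auxiliary tree gamma2 (foot at address r) at node p of gamma1."""
    if gamma1[p] != gamma2[()] or gamma2[r] != gamma2[()]:
        raise ValueError("node, root and foot must carry the same label")
    n = len(p)
    # gamma1 \ p: everything that is not strictly below p.
    result = {q: x for q, x in gamma1.items()
              if not (len(q) > n and q[:n] == p)}
    # p . gamma2: the auxiliary tree is plugged in at p.
    result.update({p + q: x for q, x in gamma2.items()})
    # p . r . (gamma1 / p): the former subtree at p hangs from the foot.
    result.update({p + r + q[n:]: x for q, x in gamma1.items() if q[:n] == p})
    return result


def word_yield(tree: Tree, terminals: Set[str]) -> List[str]:
    """Terminal labels from left to right (lexicographic order on addresses)."""
    return [tree[q] for q in sorted(tree) if tree[q] in terminals]


# Hypothetical trees: P(N(Jean), V(dort)) and the auxiliary V(V*, Adv(beaucoup)).
jean_dort: Tree = {(): "P", (1,): "N", (1, 1): "Jean", (2,): "V", (2, 1): "dort"}
beaucoup: Tree = {(): "V", (1,): "V", (2,): "Adv", (2, 1): "beaucoup"}

derived = adjoin(jean_dort, (2,), beaucoup, (1,))
print(word_yield(derived, {"Jean", "dort", "beaucoup"}))
```

On this example the $V$ node ends up with two daughters, a $V$ lexicalized by *dort* and an $Adv$ lexicalized by *beaucoup*.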

Finally, we can state the following proposition:

Proposition 5. Every derived tree (from an elementary tree lexicon) corresponds
to the reading of a SLIPN, the latter resulting from Cut operations between
SLIPNs corresponding to the elementary trees of the lexicon.

Conversely, with a lexicon of elementary SLIPNs corresponding to trees, and
with Cut operations restricted to formulas that are not lexicalized, the
readings of the resulting SLIPNs are the derived trees.

Proof. This is immediate from the previous propositions. □
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Table 5. Lexicon 




5 Examples 



5.1 Substitution and Adjunction 

First let us define, from the lexicon of the TAG, the corresponding elementary
SLIPNs. We assume the lexicon of table 5. This lexicon can yield the trees of
figure 6 (for the first tree: substitute $T_1$ for $N$ in $T_2$, then perform an adjunction
on the result; for the second tree: continue with the adjunction of $T_4$). But we
can also carry out this derivation on the SLIPNs, as shown in figures 7 and 8.

(b) Derived tree

Fig. 6. Resulting trees

Let us see how to read the SLIPN of figure 7(e) and obtain the derived tree
of figure 6(a). First, the path enters the net through the atom $P$ (the unique
output) and marks $P$ as a node. Then it follows the positive polarities and
reaches a before-link with two branches. It first chooses the positive formula
at the beginning of the red (regular) arc and crosses an axiom-link. There is a
negative lexicalized atom, so on the tree we add a node $N$ lexicalized by Jean.
Then, doing the same with the other premise of the before-link, we get a new atom
$V$, and the negative conclusion of the axiom-link is not lexicalized. So we have
a junction which produces new branches under the $V$ node. There are two
simple branches: one is a $V$ (lexicalized by dort), and the other is an $Adv$
(lexicalized by beaucoup).

To obtain a deeper adjunction, let us continue with the adjunction of $T_4$.
Figure 8 shows the different steps of the operation. We leave it to the reader
to check that polarizing the resulting SLIPN of figure ?? and reading it leads to
the tree of figure 6(b).



5.2 A Formal Language 

As in the previous section, we first define the lexicon of table 6. Note that in
this lexicon, the tensor-gate belongs to the lexical item, so that we shall never
use a tensor-gate addition during a derivation.

We only sketch the use of this lexicon, with an adjunction of $T_2$ on another
instance of $T_2$ (figure 9(a)), then an adjunction on $T_1$, resulting in the SLIPN
of figure 9(c). At every adjunction on $T_2$, a new tensor-gate appears (provided
by $T_2$), as the former one (on the derived SLIPN) disappears with the adjunction
operation (the cut between the par-gate and the tensor-gate of $T_2$). As the reader
can check, the reading actually corresponds to our expectations and generates
the word aabbccdd.

Thus, without splitting any axiom-link or adding any tensor-gate during
the derivation, we can generate the language $\{a^n b^n c^n d^n\}$.
(a) Substitution of the N node of dort by Jean

(b) Substitution (continued): cut-elimination

(c) Addition of a tensor-gate on Jean dort

(d) Adjunction of beaucoup on Jean dort

(e) Adjunction (continued): cut-elimination and polarization



Fig. 7. Operating on SLIPNs



248 S. Pogodalla 




(a) Adjunction of T 2 on another instance of T 2 




(b) Cut-elimination 




(c) Adjunction on $T_1$ and cut-elimination



Fig. 9. Generating aabbccdd 
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Table 6. Lexicon 




Conclusion 

Using a restricted fragment of pomset intuitionistic proof-nets, we showed how
to generate the same language as a TAG. This indicates how a more generic
formalization (namely that of [?]) allows both keeping the generative power and
dealing with some linguistic phenomena not by lexical rewriting rules on trees,
but by lexical definitions. For instance, we can compare the modeling of clitics
in [?] or in [11].

We also want to underline that we do not really use the partial order
capabilities of pomset proof-nets: the before-links totally order the atoms.
Of course, this results straightforwardly from the fact that the order in the trees
is total, so that the same occurs in the SLIPNs with respect to the before-links.

Moreover, we use both commutative and non-commutative connectives, and
the building of the path defines the order of the lexical items. The path performs
the splitting of the sequent required in the rules (especially the adjunction rule)
of [?].

Finally, as regards the expression of the semantics, at least two possibilities
arise: to handle it as for intuitionistic proof-nets [?], or to give it an
alternative expression, as with derivation trees (trees that track the operations
performed during a derivation).
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Abstract. We investigate natural deduction proofs of the Lambek cal- 
culus from the point of view of tree automata. The main result is that 
the set of proofs of the Lambek calculus cannot be accepted by a finite 
tree automaton. The proof is extended to cover the proofs used by gram- 
mars based on the Lambek calculus, which typically use only a subset 
of the set of all proofs. While Lambek grammars can assign regular tree 
languages as structural descriptions, there exist Lambek grammars that 
assign non-regular structural descriptions, both when considering nor- 
mal and non-normal proof trees. Combining the results of Pentus (1993) 
and Thatcher (1967), we can conclude that Lambek grammars, although 
generating only context-free languages, can extend the strong generative 
capacity of context-free grammars. Furthermore, we show that structural 
descriptions that disregard the use of introduction rules cannot be used 
for a compositional semantics following the Curry-Howard isomorphism. 



1 Introduction 

In this paper, we investigate the strong generative capacity of Lambek gram- 
mars, using natural deduction proof trees as the notion of structure assigned by 
Lambek grammars to strings they generate. 

This approach differs from Buszkowski's (1997) definition of structure which
he formalized in terms of functor/argument structures (f-structures) and phrase
structures (p-structures). His definition led, in the case of Lambek grammars,
to some unintuitive results, most notably the structural completeness theorem, 
which states that any binary tree that can be assigned to a string at all, can be 
assigned as a structural description of that string by any Lambek grammar that 
can generate that string. 

The strong generative capacity of Lambek grammars fell into disrepute be- 
cause of that result (cf. Steedman, 1993), which caused some researchers of proof 
theoretic grammars to turn their primary attention towards systems that do not 

* The results of this paper are contained in my dissertation (Tiede, 1999). I would like 
to thank my thesis advisor Larry Moss, Johan van Benthem, Christian Retore, and 
two anonymous referees for comments on this paper. All remaining errors are the 
author’s. 

M. Moortgat (Ed.): LACL’98, LNAI 2014, pp. 251 -^^ 2001. 
suffer from this defect, like non-associative Lambek grammars (cf. Moortgat,
1997). However, Buszkowski had also shown that non-associative Lambek gram- 
mars cannot extend the strong generative capacity of context-free grammars 
(Buszkowski, 1997) using his definition of structure. Thus, we arrive at a prob- 
lem for proof theoretic grammars: either they do not extend the strong generative 
capacity of context-free grammars, or they do so in a meaningless way.¹

The main result of this paper is that we can avoid the conclusion about the 
strong generative capacity of Lambek grammars if we consider natural deduction 
proof trees as the structure assigned by Lambek grammars, since they extend the 
strong generative capacity of context-free grammars but, in the case of normal 
form proofs, are not structurally complete. In addition, we show that the notions 
of p-structures and f-structures cannot be used for assigning meanings to strings
using the Curry-Howard isomorphism. 

In this paper, we assume a certain familiarity with proof theoretic methods 
in mathematical linguistics, which are covered in Buszkowski’s (1997) survey 
article. 



2 Proof Theoretic Grammars 

Proof theoretic grammars can be described as formal grammars which assign to
each element of a terminal vocabulary $\Sigma$ a finite set of formulas built from a
countable set of propositional variables using the connectives / and \. We will
write

$$a : A$$

if $a \in \Sigma$ is assigned the formula $A$ by the grammar. In addition, proof theoretic
grammars have a distinguished atomic propositional variable, typically $s$, for
sentence, and the following axioms and rules of inference:

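$$\overline{A}\;[ID] \qquad \frac{A/B \quad B}{A}\;[/E] \qquad \frac{B \quad B\backslash A}{A}\;[\backslash E]$$

$$\frac{\begin{array}{c}[B]\\ \vdots\\ A\end{array}}{A/B}\;[/I] \qquad \frac{\begin{array}{c}[B]\\ \vdots\\ A\end{array}}{B\backslash A}\;[\backslash I]$$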



Basic categorial grammars, as formalized by Ajdukiewicz and Bar-Hillel, do
not use the rules [/I] and [\I]. Lambek grammars use all of the above rules.

¹ However, the complexity of the strong generative capacity of modal Lambek grammars
appears to be still open.
A string $a_1 \cdots a_n \in \Sigma^*$ is generated by a proof theoretic grammar if there are
formulas $A_1, \ldots, A_n$, such that $a_i : A_i$, $1 \le i \le n$, and

$$A_1, \ldots, A_n \vdash s$$

using the rules of inference allowed by the particular grammar. 

Gaifman’s, Buszkowski’s, and Pentus’ studies of the weak generative capacity 
of categorial grammars, i.e. the string languages generated by categorial gram- 
mars, can be summarized as follows: basic categorial grammars, non-associative 
Lambek grammars and associative Lambek grammars generate precisely the 
context-free languages (cf. Bar-Hillel, 1964; Buszkowski, 1997; Pentus, 1993). 
These results raise the question whether there are advantages to using cate- 
gorial grammars rather than context-free grammars. For all practical purposes, 
context-free grammars certainly are easier to work with than Lambek grammars; 
for example the only known polynomial time algorithm for membership in the 
language generated by a Lambek grammar is to translate the Lambek grammar 
into its corresponding context-free grammar, using Pentus’ construction, and 
checking membership using that context-free grammar (cf. Finkel and Tellier, 
1996). 

In defense of Lambek grammars, van Benthem (1995) pointed out that they,
even if their weak generative capacity only equals that of context-free grammars,
surpass context-free grammars in terms of their strong generative capacity, i.e. 
in terms of what structures they can assign to the strings they can generate. 
However, in 1986 Buszkowski proved a theorem about the strong generative 
capacity of Lambek grammars that, while affirming that the strong generative 
capacity of Lambek grammars extends that of context-free grammars, shows 
that they do so in a way that trivializes the notion of strong generative capacity. 
He showed that any binary tree that can be assigned to a string at all, can be 
assigned as a structural description of that string by any Lambek grammar that 
can generate that string. Buszkowski called this property of Lambek grammars 
“structural completeness.” 

However, upon close inspection of Buszkowski’s proof, it becomes apparent 
that the structural completeness theorem depends on the particular way in which 
“structure assigned by a Lambek grammar to a string it generates” is defined. In 
particular, the definition employed by Buszkowski does not take into account the 
structure of proof trees, which should be at the center of this definition, since, 
when Lambek grammars are employed for syntactic descriptions, the proof trees 
form the basis of a compositional semantics of natural languages, based on the 
Curry-Howard isomorphism. 

3 Strong Generative Capacity 

In mathematical linguistics, the notion of strong generative capacity is used to 
distinguish between the mere ability of a grammar to generate a language and 
the structure that it assigns to the strings in its language. The strong generative 
capacity of a grammar is the set of structures (derivation trees, in the case of 
context-free grammars) that are assigned by the grammar to the language it
generates. 

The notion of strong generative capacity provides a formal model of the 
linguistic notion of structure. In particular, the derivation trees of context-free 
grammars were intended to provide a formal model of the notion of constituent 
structure. Furthermore, the principle of compositionality (cf. Janssen, 1997) stip- 
ulates that semantic representations be defined on structures assigned to strings, 
rather than on the strings themselves. 

Recent research in a wide variety of grammar formalisms has stressed the 
importance of their strong generative capacity. This has resulted in a renewed 
interest in tree formalisms, such as tree automata, tree grammars, and logical 
formalisms for the description of trees (see Rogers, 1998 and most of the contri- 
butions to this volume). 

In the context of categorial grammar, Buszkowski has established a variety 
of results concerning strong generative capacity. However, he uses a definition 
of structure assigned by a grammar to strings that is different from the one 
used here. The main difference between the two definitions is that Buszkowski 
considers structures to be determined by proofs, whereas here the proofs are 
the structures. In the case of classical categorial grammar, the two definitions 
coincide. However, this is not the case for grammars that can use introduction 
rules, as Buszkowski’s definition essentially only considers the elimination part 
of the proof tree. 

In addition to Buszkowski’s results, there have also been related investigations
into the complexity of the λ-terms associated with proofs of the Lambek calculus
(van Benthem, 1987; Moerbeek, 1991). However, these were concerned with
studying the λ-terms as strings, rather than as structured objects like trees.

The investigation of the strong generative capacity of grammars generalizes
the theory of formal (string) languages to tree or term languages. Terms represent
derivation trees of grammars. We review some of the basic concepts of tree
automata and tree languages; for a more complete reference see Gecseg and
Steinby (1984).

Definition 1. A tree is a term over a finite signature $\Sigma$ containing function
and constant symbols. The set of $n$-ary function symbols in $\Sigma$ will be denoted by
$\Sigma_n$. The set of all terms over $\Sigma$ is denoted by $T_\Sigma$; a subset of $T_\Sigma$ is called a tree
language or a forest. The yield of a tree $t$, denoted by $\mathrm{yield}(t)$, is defined by

$$\mathrm{yield}(c) = c, \quad c \in \Sigma_0$$

$$\mathrm{yield}(f(t_1, \ldots, t_n)) = \mathrm{yield}(t_1) \cdots \mathrm{yield}(t_n), \quad f \in \Sigma_n,\ n > 0.$$

Most of the research on tree languages has been conducted with respect to regular 
tree languages. There are many equivalent definitions of regular tree languages; 
we will use regular tree grammars. 

Definition 2. A regular tree grammar is a system $(\Sigma, \Gamma, S, \Delta)$, such that $\Sigma$ is
a finite signature (the terminal vocabulary), $\Gamma$ is a finite set of 0-ary non-terminals,




Lambek Calculus Proofs and Tree Automata 






$S \in \Gamma$ is the distinguished start symbol, $\Delta$ is a finite set of rules of the form

$$A \to t$$

with $A \in \Gamma$ and $t \in T_{\Sigma \cup \Gamma}$. These rules are interpreted as rewrite rules, just as
in the case of string grammars.

Definition 3. A tree language is regular if it is generated by a regular tree gram- 
mar. 

Regular tree languages have a Myhill-Nerode characterization similar to that
of string languages. The Myhill-Nerode theorem for tree languages will be our 
primary tool for proving tree languages non-regular. 

Definition 4. A context is a term over $\Sigma \cup \{x\}$ containing the zero-ary term $x$
exactly once.

Theorem 1. (Myhill-Nerode): A subset $S \subseteq T_\Sigma$ is regular iff the equivalence
relation $\equiv_S \subseteq T_\Sigma \times T_\Sigma$, defined by

$$\tau_1 \equiv_S \tau_2$$

iff for all contexts $\tau(x)$,

$$\tau[x \mapsto \tau_1] \in S \text{ iff } \tau[x \mapsto \tau_2] \in S,$$

determines finitely many equivalence classes, where $[x \mapsto \tau]$ denotes substituting
$x$ with $\tau$.

Proof. See Kozen (1997). ∎
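To make the use of contexts concrete, here is a small Python sketch of substitution into a context and of the test "some context distinguishes two trees". The toy language (trees f(t1, t2) whose two subtrees have equal depth) and all function names are assumptions chosen for illustration; the toy language is not taken from the paper.

```python
X = ("x",)  # the zero-ary hole of a context, assumed not to occur in ordinary trees


def substitute(context, tree):
    """Return context[x -> tree]."""
    if context == X:
        return tree
    return (context[0],) + tuple(substitute(c, tree) for c in context[1:])


def depth(t):
    return 1 + max((depth(c) for c in t[1:]), default=0)


def in_language(t):
    """Toy language: f(t1, t2) with depth(t1) == depth(t2)."""
    return t[0] == "f" and len(t) == 3 and depth(t[1]) == depth(t[2])


def chain(n):
    """The unary chain g(g(...g(a)...)) with n occurrences of g."""
    t = ("a",)
    for _ in range(n):
        t = ("g", t)
    return t


def distinguishes(context, t1, t2):
    """True iff exactly one of context[x -> t1], context[x -> t2] is in the language."""
    return in_language(substitute(context, t1)) != in_language(substitute(context, t2))


print(distinguishes(("f", X, chain(2)), chain(2), chain(3)))
```

In this toy language the context f(x, chain(k)) separates chain(k) from every chain(l) with l ≠ k, so there are infinitely many equivalence classes and, by Theorem 1, the language is not regular. The proofs later in the paper follow the same pattern.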

The relation between regular tree grammars and the derivation trees of 
context-free string grammars was established by Thatcher: 

Theorem 2. (Thatcher, 1967) If S is the set of derivation trees of some context- 
free grammar, then S is regular. 

The converse of this theorem does not hold, as there are regular tree languages 
that are not derivation trees of context-free grammars. However, the yield of any 
regular tree language is a context-free (string) language. 

4 Proof Tree Languages 

In general, we will consider structures generated by categorial grammars as term
languages over the signature $\Sigma = \{[/E], [\backslash E], [/I], [\backslash I], c\}$, where

— c is a zero-ary function symbol,

— [\I] and [/I] are unary function symbols,

— [\E] and [/E] are binary function symbols.
The terms over this signature represent proof trees that carry information neither
about the formulas for which they are a proof nor about the strings that are
generated by a grammar using this proof. The reason for using this somewhat
impoverished notion of structure is that it provides an abstract definition using
a finite signature, making it possible to consider the proof trees of every Lambek 
grammar as a language over this signature. 

It should be noted that these terms represent proofs unambiguously; we can 
decide solely on the basis of the unlabeled proof trees which leaf is withdrawn by 
an application of [\I] or [/I] without annotating the proof, since they withdraw
the leftmost or the rightmost uncancelled hypothesis, respectively. However, this 
is the case only for the Lambek calculus, because of its lack of structural rules. For 
other logics, it would be necessary to annotate the proof trees at least with which 
premise is withdrawn and how many instances of this premise are withdrawn, 
which would not be possible to do with a finite signature. 

As a matter of fact, these proof trees can be used as a variable free formalization
of the bi-directional λ-calculus (cf. Wansing, 1993). Although we are not
concerned with the labelling of the leaves with formulas here, as that would
necessitate the introduction of infinitely many labels for leaves, it should be noted
that we can use a principal type algorithm to provide an unlabelled proof tree
with the principal pair for which it is a proof, i.e. given an unlabelled proof tree
$t$, we can compute a pair $(\Gamma, \alpha)$, such that $t$ is a proof tree of

$$\Gamma \vdash \alpha$$

and any other pair $(\Delta, \beta)$ such that $t$ is a proof tree of

$$\Delta \vdash \beta$$

is a substitution instance of $(\Gamma, \alpha)$. This algorithm can be directly modeled after
the Hindley-Milner algorithm for the simply typed λ-calculus (cf. Hindley, 1997).

In order to study this formalization of proof trees, we need to consider how 
many cancelled and uncancelled assumptions are present in a proof tree, which 
is defined inductively: 

Definition 5. $c$ contains one uncancelled assumption and no cancelled assumptions.
If $\tau_1$ and $\tau_2$ contain $m$ and $n$ uncancelled assumptions and $k$ and $l$ cancelled
assumptions, respectively, then $[/E](\tau_1, \tau_2)$ and $[\backslash E](\tau_1, \tau_2)$ contain $m + n$
uncancelled and $k + l$ cancelled assumptions, and, for $m > 1$, $[/I](\tau_1)$ and $[\backslash I](\tau_1)$
contain $m - 1$ uncancelled and $k + 1$ cancelled assumptions.
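Definition 5 translates directly into a recursive function. The sketch below assumes an encoding of proof trees as nested tuples with labels `"c"`, `"/E"`, `"\\E"`, `"/I"`, `"\\I"`; the function name `assumptions` is hypothetical. It checks only the counting condition of Definition 5, not the formulas that label a proof.

```python
C = ("c",)  # a leaf: one uncancelled assumption


def assumptions(t):
    """Return (uncancelled, cancelled) for proof tree t, or None if the
    counting condition of Definition 5 is violated."""
    label, *kids = t
    if label == "c" and not kids:
        return (1, 0)
    if label in ("/E", "\\E") and len(kids) == 2:
        left, right = assumptions(kids[0]), assumptions(kids[1])
        if left is None or right is None:
            return None
        return (left[0] + right[0], left[1] + right[1])
    if label in ("/I", "\\I") and len(kids) == 1:
        inner = assumptions(kids[0])
        if inner is None or inner[0] <= 1:    # an introduction needs m > 1
            return None
        return (inner[0] - 1, inner[1] + 1)
    return None


print(assumptions(("\\I", ("\\E", C, C))))
print(assumptions(("/I", C)))
```

The second call returns `None`: withdrawing the only assumption would leave an empty antecedent, which the condition m > 1 rules out.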

4.1 Structures Assigned by Categorial Grammars 

The structures that are assigned by classical categorial grammars are defined 
with respect to the category that they assign to strings. 

Definition 6. If $a \in \Sigma$ has category $A$ (i.e. $a : A$), then $a$ is assigned the structure
$c$. If $w, v \in \Sigma^*$ are assigned the categories $A/B$ and $B$ and structures $\tau_1$ and
$\tau_2$, respectively, then $wv$ has category $A$ and is assigned the structure $[/E](\tau_1, \tau_2)$.
If they are assigned categories $B$ and $B\backslash A$, respectively, then $wv$ has category $A$
and structure $[\backslash E](\tau_1, \tau_2)$.
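Definition 6 can be read as a chart-filling procedure. The following Python sketch enumerates, for a non-empty list of words, every category derivable with the two elimination rules together with the structures assigned to it. The encoding (atoms as strings, A/B as `("/", A, B)`, B\A as `("\\", B, A)`), the helper names and the two toy lexicons are assumptions made for this illustration.

```python
from itertools import product

C = ("c",)


def slash(a, b):        # the category A/B
    return ("/", a, b)


def backslash(b, a):    # the category B\A
    return ("\\", b, a)


def structures(words, lexicon):
    """Map each category derivable for `words` (non-empty) to its set of structures."""
    n = len(words)
    chart = {(i, i + 1): {cat: {C} for cat in lexicon[w]} for i, w in enumerate(words)}
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            cell = {}
            for k in range(i + 1, j):
                pairs = product(chart[(i, k)].items(), chart[(k, j)].items())
                for (lc, ls), (rc, rs) in pairs:
                    # A/B followed by B gives A with structure [/E](t1, t2)
                    if isinstance(lc, tuple) and lc[0] == "/" and lc[2] == rc:
                        cell.setdefault(lc[1], set()).update(("/E", a, b) for a in ls for b in rs)
                    # B followed by B\A gives A with structure [\E](t1, t2)
                    if isinstance(rc, tuple) and rc[0] == "\\" and rc[1] == lc:
                        cell.setdefault(rc[2], set()).update(("\\E", a, b) for a in ls for b in rs)
            chart[(i, j)] = cell
    return chart[(0, n)]


LEX_A = {"a": ["S", slash("S", "S")]}
LEX_EN = {"john": ["np"], "sleeps": [backslash("np", "s")]}
print(structures(["a", "a", "a"], LEX_A))
print(structures(["john", "sleeps"], LEX_EN))
```

Only elimination rules are used here, so this covers basic categorial grammars; it says nothing about proofs that use [/I] or [\I].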

It is easy to show that the structures assigned by categorial grammars to 
strings they generate are regular tree languages. In fact, this immediately follows 
from the construction of Gaifman’s theorem. It has also been proved directly by 
Buszkowski using the Myhill-Nerode theorem. 

An even easier proof can be given using regular tree grammars. 

Proposition 1. The set of derivation trees of a categorial grammar is regular. 

Proof. Since the grammar assigns a finite set of categories, the set of all subformulas
of assigned categories is finite. We will use this set as the non-terminals,
i.e. the non-terminals are

$$\Gamma = \{A \mid A \in \mathrm{sub}(B),\ \exists a \in \Sigma,\ a : B\}$$

where $\mathrm{sub}(B)$ denotes the set of subformulas of $B$. The rules of the grammar
contain, for all $A, B, A/B \in \Gamma$,

$$A \to [/E](A/B, B)$$

and for all $A, B, B\backslash A \in \Gamma$,

$$A \to [\backslash E](B, B\backslash A).$$

Furthermore, if $a : A$, for some $a \in \Sigma$, then

$$A \to c$$

is a rule. ∎
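The construction in this proof is easy to mechanize. The sketch below computes the non-terminals and rules from a lexicon, under an assumed encoding (atoms as strings, A/B as `("/", A, B)`, B\A as `("\\", B, A)`); `tree_grammar` and `subformulas` are hypothetical names, and rules are represented as `(lhs, rhs)` pairs.

```python
def subformulas(cat):
    # Atoms are strings; compound categories are ("/", A, B) or ("\\", B, A).
    if isinstance(cat, str):
        return {cat}
    _, left, right = cat
    return {cat} | subformulas(left) | subformulas(right)


def tree_grammar(lexicon):
    """Return (non-terminals, rules) of the regular tree grammar of Proposition 1."""
    gamma = set()
    for cats in lexicon.values():
        for cat in cats:
            gamma |= subformulas(cat)
    rules = set()
    for f in gamma:
        if isinstance(f, tuple) and f[0] == "/":      # f = A/B
            rules.add((f[1], ("/E", f, f[2])))        # A -> [/E](A/B, B)
        if isinstance(f, tuple) and f[0] == "\\":     # f = B\A
            rules.add((f[2], ("\\E", f[1], f)))       # A -> [\E](B, B\A)
    for cats in lexicon.values():
        for cat in cats:
            rules.add((cat, ("c",)))                  # A -> c whenever a : A
    return gamma, rules


LEX = {"a": ["S", ("/", "S", "S")]}
GAMMA, RULES = tree_grammar(LEX)
print(sorted(RULES, key=repr))
```

For the toy lexicon above, the result has two non-terminals and three rules; the start symbol would be the distinguished category of the grammar.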



4.2 Structures Determined by Proofs 

As was pointed out above, in the case of Lambek grammars Buszkowski introduced
a notion of structure assigned by a Lambek grammar that differs from the one
used here. In order to define his notion of structure, he notes that the Lambek
calculus can be axiomatized using as axioms all the

$$A \vdash B$$

such that

$$A \vdash B$$

is provable in the Lambek calculus, and as rules of inference [/E] and [\E]. This
is true, since if

$$A_1, \ldots, A_n \vdash B$$

is derivable in the Lambek calculus, then so is

$$A_1 \vdash (\ldots((B/A_n)\ldots)/A_2)$$

and

$$A_1, \ldots, A_n \vdash B$$

is then derivable using [/E],

$$A_i \vdash A_i,$$

and

$$A_1 \vdash (\ldots((B/A_n)\ldots)/A_2).$$

Using this axiomatization, Buszkowski is able to assign to strings derivable
by a Lambek grammar structures over the signature [/E], [\E] using the rules
of how structures are assigned by categorial grammars with the additional rule:
if $w \in \Sigma^*$ has category $A$, $w$ is assigned structure $t$, and

$$A \vdash B$$

is derivable in the Lambek calculus, then $w$ has category $B$ and is assigned
structure $t$. The structures built from [/E] and [\E] are called f-structures
(for function/argument structures) by Buszkowski. Buszkowski also considers
structures built from a single function symbol [E], which are called p-structures
(for phrase structures). p-structures are the homomorphic image of f-structures
using the following mapping:

$$h(c) = c$$

$$h([\backslash E](t_1, t_2)) = h([/E](t_1, t_2)) = [E](h(t_1), h(t_2)).$$
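The mapping h is a simple relabelling. A minimal Python sketch, assuming f-structures are encoded as nested tuples with labels `"c"`, `"/E"` and `"\\E"` (an encoding chosen for this illustration):

```python
C = ("c",)


def h(t):
    """Map an f-structure to its p-structure by forgetting the direction of [/E] and [\\E]."""
    if t == C:
        return C                                  # h(c) = c
    label, left, right = t
    if label not in ("/E", "\\E"):
        raise ValueError("not an f-structure: " + repr(label))
    return ("E", h(left), h(right))               # h([/E](t1, t2)) = h([\E](t1, t2)) = [E](h(t1), h(t2))


f1 = ("/E", C, ("\\E", C, C))
f2 = ("\\E", C, ("/E", C, C))
print(h(f1) == h(f2))
```

The example shows two distinct f-structures with the same p-structure, so h is not injective.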

The first criticism concerning his definition is that it does not take into ac- 
count the structure of the proof that actually establishes that a particular string 
is generated by a grammar. The second criticism is that his definition of structure 
leads to structural completeness. The last criticism of his definition is concerned 
with the question of whether it is possible to use Buszkowski’s definition of 
structure as a basis for a compositional semantics based on the Curry-Howard 
isomorphism. I believe that this is an important point to consider, since the 
main purpose of the notion of structure in linguistics is to use it as a basis for 
semantics, i.e. to define a (compositional) function that assigns meanings on the 
basis of the structure assigned by some grammar. However, it is not possible to 
use Buszkowski’s notion of structure for this purpose. 

Proposition 2. There is no function $f$ from p-structures or f-structures to
proof trees, such that every normal-form proof tree that is a proof of

$$A_1, \ldots, A_n \vdash B$$

is assigned to some p-structure or f-structure determined by

$$A_1, \ldots, A_n \vdash B.$$

² To be precise, Buszkowski does not use the same notation as I do; however, both
notations convey the same meaning.
Proof. The easiest way to show this is to demonstrate that already the base
case fails, i.e. that there is a sequent

$$A \vdash A$$

that has two (or more) normal form proofs. If some symbol, $a \in \Sigma$, is assigned
$A$ by a Lambek grammar, then there is only one p-structure or f-structure that
can be assigned to $a$.

Consider the following sequent which has two normal form proofs in the
Lambek calculus:

$$(A/(A\backslash A))\backslash A \vdash (A/(A\backslash A))\backslash A$$

The first proof is a simple application of [ID]:

$$\overline{(A/(A\backslash A))\backslash A}\;[ID],$$

while the second is somewhat more complex:

$$
\dfrac{\dfrac{[A/(A\backslash A)]^3 \quad
  \dfrac{\dfrac{\dfrac{\dfrac{[A]^1 \quad [A\backslash A]^2}{A}\,[\backslash E]}{A/(A\backslash A)}\,[/I]^2 \quad (A/(A\backslash A))\backslash A}{A}\,[\backslash E]}{A\backslash A}\,[\backslash I]^1}
  {A}\,[/E]}
  {(A/(A\backslash A))\backslash A}\,[\backslash I]^3
$$

This completes the proof. ∎



4.3 Proofs as Structures 

If we want to consider proofs to be structures, we have two options to consider: 
either we consider every proof that establishes that a certain string is generated 
by a Lambek grammar to be the structures assigned to that string by that 
grammar, or we only consider those proofs that are in normal-form. It should 
be noted that in the second case neither the weak generative capacity nor the 
semantic expressibility of the grammar changes. We will consider both options, 
as they differ with respect to their formal properties. 

We first consider the set of all proofs of the Lambek calculus, although no
grammar could ever use the set of all proofs. Let us call the set of proofs of the
Lambek calculus $T_N$.

Theorem 3. The set of well-formed proof trees of the Lambek calculus is not
regular.

Proof. Consider the following context

$$\tau_0(x) = [\backslash I]([\backslash E](x, c)).$$

Any tree that can be substituted for $x$ has to contain at least one uncancelled
assumption. If it doesn’t, then the term is not well-formed, as we cannot withdraw
the premise to its left and arrive at a well-formed proof tree in the Lambek
calculus. Let us generalize the above observation and define

$$\tau_{n+1}(x) = [\backslash I](\tau_n(x)).$$

Also, for all $n \geq 0$, let $\pi_n$ be an arbitrary well-formed proof tree containing $n + 1$
uncancelled assumptions. Into any $\tau_n(x)$ we can only substitute a tree with at
least $n + 1$ uncancelled assumptions. Therefore, for all $k, l$ such that $k \neq l$,

$$\pi_k \not\equiv_{T_N} \pi_l,$$

because either

$$\tau_k[x \mapsto \pi_l] \notin T_N$$

or

$$\tau_l[x \mapsto \pi_k] \notin T_N.$$

This immediately implies that $\equiv_{T_N}$ determines infinitely many equivalence
classes, since we have infinitely many pairwise non-equivalent terms. Therefore
the set of proofs of the Lambek calculus is not regular. ∎
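The counting argument of this proof can be replayed mechanically. The Python sketch below builds the contexts and, as one possible choice of witness trees, right-nested applications of [/E] with n + 1 leaves, and then tests well-formedness using only the counting condition of Definition 5. The tuple encoding and the names `tau`, `pi`, `substitute` and `uncancelled` are assumptions of this sketch.

```python
C = ("c",)   # a leaf: one uncancelled assumption
X = ("x",)   # the hole of a context


def substitute(context, tree):
    """Return context[x -> tree]."""
    if context == X:
        return tree
    return (context[0],) + tuple(substitute(k, tree) for k in context[1:])


def uncancelled(t):
    """Number of uncancelled assumptions, or None if Definition 5 is violated."""
    label, *kids = t
    if label == "c":
        return 1
    counts = [uncancelled(k) for k in kids]
    if None in counts:
        return None
    if label in ("/E", "\\E"):
        return sum(counts)
    if label in ("/I", "\\I") and counts[0] > 1:   # an introduction needs m > 1
        return counts[0] - 1
    return None


def tau(n):
    """tau_0(x) = [\\I]([\\E](x, c)),  tau_{n+1}(x) = [\\I](tau_n(x))."""
    t = ("\\I", ("\\E", X, C))
    for _ in range(n):
        t = ("\\I", t)
    return t


def pi(n):
    """One possible witness: a tree with n + 1 leaves and no introductions."""
    t = C
    for _ in range(n):
        t = ("/E", C, t)
    return t


table = [[uncancelled(substitute(tau(k), pi(l))) is not None for l in range(4)] for k in range(4)]
for row in table:
    print(row)
```

Row k of the table is `True` exactly in the columns l ≥ k, so any two distinct witness trees are separated by some context, as the proof requires.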

Corollary 1. The set of normal form proofs of the Lambek calculus is not reg- 
ular. 

Proof. In the above construction, choose $\pi_n$ to be a proof with $n + 1$ uncancelled
assumptions and with no occurrence of either [\I] or [/I]. Substituting $\pi_n$ for $x$
in $\tau_n(x)$ gives a normal form proof tree, since there are no introductions followed
by eliminations. Again we arrive at infinitely many equivalence classes, showing
that the set is not regular. ∎

However, in the above proofs we considered the set of all proof trees of the 
Lambek calculus. The complexity of proof trees of grammars based on the Lam- 
bek calculus, which only use a subset of the set of all proof trees, depends on 
whether we consider all the possible proof trees that a grammar can assign to the 
strings in its language or only those in normal form. The approach taken both 
in traditional linguistics and in formal language theory is to consider any two 
derivations of a string as different as long as they are not identical. However, 
we could also consider two derivations as identical if they are βη-equivalent,
mainly for semantic reasons.³ Such a restriction can change the complexity of
the structures assigned by a grammar, as the following theorem illustrates.

Theorem 4. There is a Lambek grammar G such that its set of non-normal 
proof trees is not regular, although its set of normal proof trees is regular. 

³ See Wansing (1993) for a definition of βη-equivalence in the context of the Lambek
calculus and van Benthem (1995) for a discussion of using normal proof trees as
structural representations.
Proof. Consider the grammar G which generates the regular (string) language
$a^+$:

$$a : S,\quad S/S.$$

The normal trees of G are regular using the construction in Proposition 1 and
the fact that normal proof trees of this grammar do not use introduction rules
(see below). To show that the set of all proof trees of G is not regular, we use
the Myhill-Nerode theorem and construct a set of contexts

$$T = \{\tau_n(x) \mid n \in \mathbb{N}\}$$

and a set of trees

$$\Pi = \{\pi_n \mid n \in \mathbb{N}\},$$

such that for all $n, m$ with $n \neq m$,

$$\pi_n \not\equiv_G \pi_m,$$

where $\equiv_G$ denotes the Myhill-Nerode equivalence relation of the proof tree
language of G.

We construct $T$ by

$$\tau_n(x) = \tau(x)[x \mapsto [\backslash E]([ID], [\backslash E]([ID], \ldots, [\backslash E]([ID], [\backslash I](\ldots[\backslash I](x)\ldots))\ldots))]$$

with the number of [\E]’s matching the number of [\I]’s, and $\Pi$ by

$$\pi_1 = [/E]([ID], [/E]([ID], [ID]))$$

$$\pi_{n+1} = [/E]([ID], \pi_n).$$

First, notice that, by the construction, for all $n$,

$$\tau_n(x)[x \mapsto \pi_n]$$

is a proof of the grammar, but for all $k < n$,

$$\tau_n(x)[x \mapsto \pi_k]$$

is not, since $\pi_k$ does not contain enough uncancelled assumptions. Therefore, for
all $k, l$ such that $k \neq l$,

$$\pi_k \not\equiv_G \pi_l,$$

which completes the proof. ∎

It should be noted that the above construction does not preserve normal 
forms of proofs. Thus, it is a natural question to ask what the tree automata 
theoretic complexity of normal form proofs of a grammar is. This perspective 
brings with it an interesting approach to strong generative capacity, disregarding 
all derivations that can be counted as equal to some normal form proof tree. 
The first observation is that there are Lambek grammars whose normal form
proof trees are regular. The set of normal form proofs that can be used by a Lambek
grammar that generates a finite language has to be finite, and therefore regular,
by van Benthem’s finiteness theorem (cf. van Benthem, 1995). Furthermore,
any Lambek grammar that only assigns categories of the form $A$, $A/B$, $(A/B)/C$,
with $A$, $B$, $C$ atomic, has normal-form proof trees that do not use any introduction
rules, which can be established by an easy induction on the height of the
normal form proof tree. These proof tree languages are therefore equivalent to
the proof tree languages of basic categorial grammars assigning the same categories.
Since the proof tree languages of basic categorial grammars are regular,
this is another example of a regular subset of the set of all normal proof tree
languages of Lambek grammars.

However, there are examples of normal form proof tree languages that are 
not regular. 

Theorem 5. There is a Lambek grammar such that its set of normal form proof 
trees is not regular. 

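Proof. Consider the following grammar that generates the regular (string) language
$a^+$:

$$a : S,\quad S/(A/A),\quad S/(S/(A/A)).$$

The following proof trees are normal-form derivation trees for the strings a, aa,
aaa, respectively:

$$S$$

$$\dfrac{S/(S/(A/A)) \quad S/(A/A)}{S}\,[/E]$$

$$
\dfrac{S/(S/(A/A)) \quad
  \dfrac{\dfrac{S/(S/(A/A)) \quad
    \dfrac{\dfrac{S/(A/A) \quad
      \dfrac{\dfrac{[A/A]^1 \quad \dfrac{[A/A]^2 \quad [A]^3}{A}\,[/E]}{A}\,[/E]}{A/A}\,[/I]^3}
      {S}\,[/E]}
    {S/(A/A)}\,[/I]^2}
  {S}\,[/E]}
  {S/(A/A)}\,[/I]^1}
{S}\,[/E]
$$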

In general, normal-form proof trees of this grammar for strings of length
n > 2 have the following form:

— they have a top part, consisting of n − 1 occurrences of A/A:

$$\dfrac{[A/A] \quad \begin{array}{c}\dfrac{[A/A] \quad [A]}{A}\,[/E]\\ \vdots\\ A\end{array}}{A}\,[/E]$$

— a middle part:

$$\dfrac{S/(A/A) \quad A/A}{S}\,[/E]$$

— and a bottom part, where the number of [/I] matches the number of A/A
in the top part:

$$\dfrac{S/(S/(A/A)) \quad \begin{array}{c}\dfrac{S/(S/(A/A)) \quad \dfrac{S}{S/(A/A)}\,[/I]}{S}\,[/E]\\ \vdots\\ S/(A/A)\end{array}}{S}\,[/E]$$

From this observation it is straightforward to conclude that the set of normal
form proof trees is not regular. We can again construct an infinite set of contexts

$$\{\tau_n(x) \mid n \in \mathbb{N}\}$$

defined by

$$\tau_0(x) = [/E](c, [/I]([/E](c, [/I]([/E](c, [/I](x))))))$$

$$\tau_{n+1}(x) = [/E](c, [/I](\tau_n(x)))$$

and an infinite set of trees

$$\{\pi_n \mid n \in \mathbb{N}\}$$

defined by

$$\pi_0 = [/E](c, [/E](c, c))$$

$$\pi_{n+1} = [/E](c, \pi_n)$$

such that

$$\tau_k(x)[x \mapsto \sigma]$$

is a normal-form proof tree that is used to establish that some string is generated
by the grammar iff $\sigma = \pi_k$. It should be noted that these proofs are in normal
form although they follow introductions by eliminations; the formula
that has been introduced is the minor premise of the elimination rule. ∎
Finally, we need to revisit the question of structural completeness. As was
pointed out above, Lambek grammars that assign formulas of the form $A$, $A/B$,
$(A/B)/C$, with $A$, $B$, $C$ atomic, have regular proof tree languages. It is known
that as a consequence of Gaifman’s theorem every categorial grammar is equivalent
to one that assigns formulas of the above form. Since basic categorial
grammars are not structurally complete, we can conclude:

Theorem 6. For every context-free language, there is a Lambek grammar that 
is not structurally complete which generates it, assuming that structures that are 
assigned are normal form proof trees. 

Proof. This follows immediately from Pentus’ theorem and the discussion in
the preceding paragraph. ∎

5 Conclusion 

The main conclusion is that using normal form proof trees as structural descrip- 
tions that Lambek grammars assign to strings can extend the strong generative 
capacity of context-free grammars, but does not trivialize the notion of strong 
generative capacity. Furthermore, the use of proof trees as structural descriptions 
makes it possible to use structural descriptions for a compositional semantics, 
based on the Curry-Howard isomorphism, which is impossible if structures are 
used that disregard the use of introduction rules.

One of the topics for further research on strong generative capacity is to 
consider proof theoretic grammars based on extensions of the Lambek calcu- 
lus, including those with additional additive or multiplicative connectives and 
modalized versions of the Lambek calculus. Some results in the case of modalized 
Lambek grammars with one residuation modality have been obtained by Jaeger 
(1998). In the case of modalized Lambek grammars with a variety of modalities 
and interaction postulates between them, there is one obstacle: Carpenter (1999) 
has shown that modal Lambek grammars can generate any recursively enumer- 
able language. Thus, their normal form proof trees extend even the context-free 
tree languages, as the yield of any context-free tree language is an indexed lan- 
guage. Thus, it would be necessary to restrict modal Lambek grammars in such 
a way that they can only generate (at most) indexed languages or subclasses 
of the indexed languages. However, it appears to be an open problem how to 
restrict modal Lambek grammars in such a way. 

Another question that would naturally fall into the study of the strong generative
capacity of proof theoretic grammars is: given that Lambek grammars
extend the strong generative capacity of context-free grammars but generate only
context-free string languages, what decidable properties of the strong generative
capacity of context-free grammars are also decidable for Lambek grammars? For
example, the following properties of context-free grammars are decidable:

— strong equivalence of two context-free grammars: do two context-free gram- 
mars assign the same structures to the strings they generate? 
— ambiguity: does a context-free grammar assign different structures to some
string it generates? 

Other questions that relate to this area are what kind of logic would be needed
for a descriptive approach to proof trees of Lambek grammars, following Rogers’
(1998) work, and the relationship of proof trees to those of tree adjoining grammars,
which is related to work by Joshi and associates (cf. the contribution by
Joshi, Kulick, and Kurtonina to this volume). 
Abstract. Linear indexed grammars (LIGs) can be used to describe
nonlocal dependencies. The indexing mechanism, however, can only account
for dependencies that are nested. In natural languages one can easily
find examples to which this simple model cannot be applied straightforwardly.
In this paper I will show that a formalism fitting better to
linguistic structures can be obtained by using a sequence of pushdowns
instead of one pushdown for the storage of the indices in a derivation.
Crucially, we have to avoid unwanted interactions between the pushdowns
that would make possible the simulation of a Turing machine. 0
solves this problem for multi-pushdown automata by restricting reading 
to the first nonempty pushdown. I will argue that the corresponding re- 
striction on writing is more natural from a linguistic point of view. I will 
show that, under each of both restrictions, grammars with a sequence of 
n pushdowns give rise to a subclass of the nth member of the hierarchy 
defined by 1 1 l hj . and therefore are mildly context sensitive. 



1 Introduction 

Nonlocal dependencies in natural languages can be described with linear indexed 
grammars (LIGs) (E). The indexing mechanism can only account for dependencies
that are nested. This is due to the fact that the indices are stored as on a
pushdown: the index introduced first is read last. In human languages one can 
easily find counterexamples to this simple model. Thus, the obvious step is to in- 
vestigate similar formalisms using another way to store indices. Up to now these 
possibilities have scarcely been explored. P] introduces multiset valued LIGs, 
that are not only an extension of LIGs w.r.t. the structure of the index stack. 

gives extensions using pushdowns of pushdowns, which, however, seem to 
have no direct linguistic interpretation. 

According to a number of recent linguistic theories we have to distinguish 
between different types of movement such as movement of heads, movement 
into argument positions and movement into other positions (UDI) or between 
types determined by the feature that licenses the movement (cf. |2E1). Further
evidence for distinguishing between different types of movement comes from psy- 
cholinguistic research. It can be observed that the cognitive difficulty to process a 
sentence does not directly depend on the number of dependencies that have to be 
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computed (at the same time) but rather on the number of similar dependencies 

(cf. jlSKp . 

These ideas suggest a formalization as a kind of LIG with more than one 
pushdown of indices in a derivation. Such grammars would, however, generate 
all recursively enumerable sets, as is easy to verify. A restriction that makes a 
system with more than one pushdown less powerful is proposed in [?]. There it is 
shown that automata with a sequence of pushdowns, restricted by the condition 
that a pushdown can only be read if all pushdowns to its left are empty, accept 
only context sensitive languages parsable in polynomial time. As I show below, 
this result can be carried over to grammars with similar storages for indices. I 
will define these grammar formalisms and give a characterization of the classes of 
generated languages in terms of controlled grammars. This results in an elegant 
new definition of the hierarchy established in [?]. It can be argued that the 
restriction used on the accessibility of the pushdowns is not plausible from a 
linguistic point of view. A similar condition restricting writing instead of reading 
seems much better suited. Using this restriction I will define a new hierarchy of 
languages. I will show that each class of languages of the hierarchy defined in 
[?] as well as each class of the new hierarchy is a subset of a class from the 
hierarchy defined in [?]. Since all languages of this hierarchy were shown to 
be mildly context sensitive, the languages generated by LIGs extended with a 
restricted sequence of pushdowns are mildly context sensitive as well. 

2 Grammars with Storage 

In order to study grammars with additional memory, it is useful to define storages 
independently of the grammars or automata using them. 

Definition 1. A storage type is a 6-tuple $S = (C, C_0, C_F, P, F, m)$, where $C$ 
is a set of configurations, $C_0 \subseteq C$ and $C_F \subseteq C$ the sets of initial and final 
configurations, respectively, $P$ is a set of predicate symbols, $F$ a set of instruc- 
tion symbols, and $m$ is the meaning function, which associates every $p \in P$ with 
a mapping $m(p) : C \to \{\mathit{true}, \mathit{false}\}$ and every $f \in F$ with a partial function 
$m(f) : C \to C$. 

The trivial storage type $S_{triv}$ is defined by $S_{triv} = (\{c\}, \{c\}, \{c\}, \emptyset, \{\mathrm{id}\}, m)$, 
where $c$ is an arbitrary object and $m(\mathrm{id})(c) = c$. An (ordinary) pushdown $S_{pd}$ 
over some finite alphabet $\Gamma$ is defined by $S_{pd} = (\Gamma^*, \{\varepsilon\}, \{\varepsilon\}, P, F, m)$ with 
$P = \{\mathrm{top} = \gamma \mid \gamma \in \Gamma\}$ and $F = \{\mathrm{push}(\gamma) \mid \gamma \in \Gamma\} \cup \{\mathrm{pop}\} \cup \{\mathrm{id}\}$ and for every 
$a \in \Gamma$ and $\beta \in \Gamma^*$, 

\[ m(\mathrm{top} = \gamma)(a\beta) = (a = \gamma) \qquad m(\mathrm{pop})(a\beta) = \beta \]
\[ m(\mathrm{push}(\gamma))(a\beta) = \gamma a\beta \qquad m(\mathrm{id})(a\beta) = a\beta \]
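The definitions above can be made concrete with a small sketch. The class name `StorageType` and the factory `pushdown` are illustrative names only, not taken from the paper. Configurations are represented as Python tuples with the top of the pushdown at index 0, and an undefined value of the partial function $m(f)$ is modelled by `None`. Allowing `push` on the empty pushdown is an assumption of this sketch; the equations above are stated for non-empty configurations only.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

Config = Tuple[str, ...]   # a pushdown configuration; index 0 is the top


@dataclass(frozen=True)
class StorageType:
    """Sketch of a storage type: initial/final configuration plus the meaning function."""
    initial: Config
    final: Config
    predicates: Dict[str, Callable[[Config], bool]]                  # m(p)
    instructions: Dict[str, Callable[[Config], Optional[Config]]]    # m(f), None = undefined


def pushdown(alphabet: str) -> StorageType:
    """Build S_pd over the given alphabet (one character per stack symbol)."""
    predicates = {}
    instructions = {
        "id": lambda c: c,
        "pop": lambda c: c[1:] if c else None,   # pop is undefined on the empty pushdown
    }
    for g in alphabet:
        predicates[f"top={g}"] = lambda c, g=g: bool(c) and c[0] == g
        # Assumption of this sketch: push is also defined on the empty pushdown.
        instructions[f"push({g})"] = lambda c, g=g: (g,) + c
    return StorageType((), (), predicates, instructions)


pd = pushdown("xy")
c = pd.instructions["push(y)"](pd.instructions["push(x)"](pd.initial))
```

Here `c` is the configuration reached by pushing `x` and then `y` onto the initial (empty) pushdown, so `y` is on top.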

For the sake of convenience I will sometimes assume that there is an additional 
predicate empty which is true if the stack is empty and false otherwise. This 
predicate is in some sense superfluous since we could use a “bottom-of-stack” 
symbol and test whether this symbol is on top of the stack. 

On the basis of this universal definition of storages it is possible to make a 
generalization of LIGs w.r.t. the organization of the indices. These generalized 
grammars, called context free linear $S$-grammars, were introduced in [?]. 

Definition 2. If $S = (C, C_0, C_F, P, F, m)$ is a storage type, then a context 
free linear $S$-grammar (CFL-$S$-G) is a five-tuple $G = (N, \Sigma, R, A_{in}, c_0)$, where 
$N$, $\Sigma$ denote the sets of nonterminal and terminal symbols, resp., $A_{in} \in N$ is a 
distinguished symbol, called the start symbol, $c_0 \in C_0$ is the initial configuration, 
and $R$ is a set of production rules of one of the following two forms: 

\[ A \to \text{if } \pi \text{ then } \zeta_1 B f \zeta_2 \qquad A \to \text{if } \pi \text{ then } w \]

where $A \in N$, $B \in N$ called the distinguished symbol, $\pi \in BE(P)$, $\zeta_1, \zeta_2 \in (N \cup \Sigma)^*$, 
$f \in F$, $w \in \Sigma^*$. 

A pair of symbols from $N \times C$ is called an object. A string $\sigma \in ((N \times C) \cup \Sigma)^*$ is 
called a sentential form. A sentential form $\sigma$ is said to derive a sentential form 
$\tau$, written $\sigma \Rightarrow_G \tau$, if either (1) or (2). 

(1) $\sigma = \alpha (A, c) \beta$ 
    $A \to \text{if } \pi \text{ then } \zeta_1 B f \zeta_2 \in R$ 
    $m(\pi)(c) = \mathit{true}$ 
    $m(f)$ is defined on $c$ 
    $\tau = \alpha \zeta_1' (B, c') \zeta_2' \beta$ 

(2) $\sigma = \alpha (A, c) \beta$ 
    $A \to \text{if } \pi \text{ then } w \in R$ 
    $m(\pi)(c) = \mathit{true}$ 
    $\tau = \alpha w \beta$ 

where $A, B \in N$, $\alpha, \beta \in ((N \times C) \cup \Sigma)^*$, $w \in \Sigma^*$, $c \in C$, $c' = m(f)(c)$ and $\zeta_1'$, $\zeta_2'$ are 
obtained from $\zeta_1$ and $\zeta_2$ respectively by replacing every nonterminal $D$ by $(D, c_0)$. 
The reflexive and transitive closure of $\Rightarrow_G$, denoted by $\Rightarrow_G^*$, is defined as usual. The 
language generated by $G$ is defined as $L(G) = \{w \in \Sigma^* \mid (A_{in}, c_0) \Rightarrow_G^* w\}$. The 
class of languages generated by CFL-$S$-Gs is denoted by $\mathcal{L}_{CFL}(S)$. Obviously, 
CFL-$S_{pd}$-Gs correspond to normal LIGs. With respect to the corresponding 
occurrences, the object $(B, c')$ is called the distinguished child of $(A, c)$. Given a 
derivation let $s = (A_1, c_1) \dots (A_n, c_n)$ with $n \in \mathbb{N}$ be some sequence of objects 
such that each $(A_{i+1}, c_{i+1})$ is the distinguished child of $(A_i, c_i)$ for $i = 1, \dots, n-1$. 
Then $s$ is called the spine from $(A_1, c_1)$ to $(A_n, c_n)$ and $(A_n, c_n)$ is called a 
distinguished descendent of $(A_1, c_1)$. 
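To illustrate how the storage configuration travels along a spine, the following sketch enumerates the words of a small CFL-$S_{pd}$-G. The grammar in `RULES` is a hypothetical example chosen for this sketch (it is not taken from the paper) and is meant to generate $\{a^n b^n c^n d^n\}$. As a simplifying assumption, every rule contains only terminals next to its distinguished nonterminal, so that a sentential form always contains exactly one object, and predicates are single named tests rather than arbitrary boolean expressions.

```python
from collections import deque

# Rules of a hypothetical CFL-S_pd-G.
# (lhs, predicate, left terminals, distinguished nonterminal, instruction, right terminals)
# or (lhs, predicate, terminal word) for rules of the second form.
RULES = [
    ("A", "true",  "a", "A", ("push", "x"), "d"),
    ("A", "true",  "",  "T", ("id",),       ""),
    ("T", "top=x", "b", "T", ("pop",),      "c"),
    ("T", "empty", ""),
]


def holds(pred, stack):
    """m(pred)(stack) for the three kinds of predicates used above."""
    if pred == "true":
        return True
    if pred == "empty":
        return stack == ()
    return bool(stack) and stack[0] == pred.split("=")[1]   # "top=g"


def execute(instr, stack):
    """m(instr)(stack); None models an undefined result."""
    if instr[0] == "push":
        return (instr[1],) + stack
    if instr[0] == "pop":
        return stack[1:] if stack else None
    return stack                                            # id


def language(start, max_len):
    """All generated words of length at most max_len."""
    words = set()
    queue = deque([("", (start, ()), "")])   # sentential form: left, object, right
    while queue:
        left, (nt, stack), right = queue.popleft()
        for rule in RULES:
            if rule[0] != nt or not holds(rule[1], stack):
                continue
            if len(rule) == 3:                               # rule of form (2)
                word = left + rule[2] + right
                if len(word) <= max_len:
                    words.add(word)
                continue
            _, _, l, b, f, r = rule                          # rule of form (1)
            new_stack = execute(f, stack)
            if new_stack is None or len(left + l + r + right) > max_len:
                continue
            queue.append((left + l, (b, new_stack), r + right))
    return words


result = language("A", 8)
```

Each queue entry corresponds to a sentential form $\alpha (A, c) \beta$; the pushdown `c` is handed only to the distinguished child, which is what keeps the dependencies nested.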

A language generated by a CFL-$S$-G can be characterized in terms of a 
controlled grammar and an $S$-automaton. This will become important when we 
investigate languages generated by grammars using more than one pushdown. 

Definition 3. If $S = (C, C_0, C_F, P, F, m)$ is a storage type, then an $S$- 
automaton $M$ is a tuple $(Q, \Sigma, \delta, q_0, c_0, Q_F)$, where $Q$ is a finite set of states, $\Sigma$ 
is the input alphabet, $q_0 \in Q$ the initial state, $c_0 \in C_0$ the initial configuration, 
$Q_F \subseteq Q$ the set of final states, and $\delta$, the transition relation, a finite subset of 
$Q \times \Sigma_\varepsilon \times BE(P) \times Q \times F$, where $BE(P)$ denotes the set of boolean expressions 
over $P$. 

^ The empty string is denoted by $\varepsilon$. For each set $V$ the notation $V_\varepsilon$ is used as an 
abbreviation for $V \cup \{\varepsilon\}$ and $\#V$ denotes the number of elements of $V$. 

The set $ID(M) = Q \times \Sigma^* \times C$ is called the set of instantaneous descriptions. For 
each $(q_1, xw, c_1), (q_2, w, c_2) \in ID(M)$ with $x \in \Sigma_\varepsilon$ we write $(q_1, xw, c_1) \vdash_M 
(q_2, w, c_2)$ if there exists $(q_1, x, \pi, q_2, f) \in \delta$ such that $m(\pi)(c_1) = \mathit{true}$, 
$m(f)$ is defined on $c_1$ and $m(f)(c_1) = c_2$. The transitive and reflexive clo- 
sure $\vdash_M^*$ of $\vdash_M$ is defined as usual. The language accepted by $M$ is defined as 
$L(M) = \{w \mid (q_0, w, c_0) \vdash_M^* (q, \varepsilon, c)$ for some $c \in C$ and $q \in Q_F\}$ if $Q_F \neq \emptyset$ 
and $L(M) = \{w \mid (q_0, w, c_0) \vdash_M (q_1, w_1, c_1) \vdash_M \dots (q_n, \varepsilon, c_n)$ for some $c_n \in C_F$, 
$q_n \in Q$, $c_i \in C - C_F$ and $q_i \in Q$ for $i = 1, \dots, n-1\}$ otherwise. In the first case 
we say that $M$ accepts by final state. In the second case $M$ accepts by final 
configuration. We denote by $\mathcal{L}_C(S)$ and $\mathcal{L}_Q(S)$ the classes of languages accepted 
by $S$-automata accepting by final configuration and $S$-automata accepting by 
final state, respectively. If $\mathcal{L}_Q(S) = \mathcal{L}_C(S)$ and $\#C_0 = \#C_F = 1$, then we call 
$S$ well-formed. In this case we can drop the subscripts and write $\mathcal{L}(S)$. If more- 
over $C_0 = C_F$, then we say that $S$ is concatenating. This will turn out to be 
an important property if composition of storage types is considered. A typical 
example of a concatenating storage type is provided by $S_{pd}$. 
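The step relation $\vdash_M$ and acceptance by final state can be sketched as a breadth-first search over instantaneous descriptions. The function name `accepts`, the dictionaries `PREDS` and `INSTRS`, and the example automaton `DELTA` (an $S_{pd}$-automaton intended to accept $\{a^n b^n\}$) are all assumptions of this sketch. Predicates are restricted to single named tests instead of full boolean expressions, and `max_ids` is a purely practical bound on the search.

```python
from collections import deque

# m(p) and m(f) for a pushdown with one stack symbol; tuples keep the top at index 0.
PREDS = {
    "true": lambda c: True,
    "empty": lambda c: c == (),
    "top=x": lambda c: bool(c) and c[0] == "x",
}
INSTRS = {
    "id": lambda c: c,
    "push(x)": lambda c: ("x",) + c,
    "pop": lambda c: c[1:] if c else None,      # None models "undefined"
}

# Transitions (q, x, predicate, q', f) with x a single input symbol or "" for epsilon.
DELTA = [
    ("q0", "a", "true",  "q0", "push(x)"),
    ("q0", "",  "true",  "q1", "id"),
    ("q1", "b", "top=x", "q1", "pop"),
    ("q1", "",  "empty", "qf", "id"),
]


def accepts(delta, q0, c0, final_states, word, max_ids=10_000):
    """Return True if some sequence of steps consumes `word` and ends in a final state."""
    start = (q0, word, c0)
    seen, queue = {start}, deque([start])
    while queue and len(seen) <= max_ids:
        q, rest, c = queue.popleft()
        if not rest and q in final_states:
            return True
        for (p, x, pred, p2, f) in delta:
            if p != q or not rest.startswith(x) or not PREDS[pred](c):
                continue
            c2 = INSTRS[f](c)
            if c2 is None:                      # m(f) is not defined on c
                continue
            nxt = (p2, rest[len(x):], c2)
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


ok = accepts(DELTA, "q0", (), {"qf"}, "aabb")
```

The triple `(q, rest, c)` plays the role of an instantaneous description $(q, w, c)$; a transition is applicable only if its predicate holds and its instruction is defined on the current configuration.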

Linear control of context free grammars (CFGs) is defined in [?]. The defini- 
tion is twofold. The first part says which paths have to be controlled; the second 
part defines the control itself. 

Definition 4. A linear distinguished grammar (LDG) is a quadruple $G = 
(N, \Sigma, R, A_{in})$, where $N$, $\Sigma$ and $A_{in}$ (the start symbol) are interpreted as in a nor- 
mal CFG and where $R$ is a finite set of production rules of the form $A \to \beta_1 X! \beta_2$ 
with $A \in N$, $X \in N \cup \Sigma$, called the distinguished symbol, $\beta_1, \beta_2 \in (N \cup \Sigma)^*$, 
and $!$ a special symbol not in $(N \cup \Sigma)$. A Linear Controlled Grammar (LCG) is 
a pair $K = (G, H)$, where $G$ is an LDG and $H$ is a language over $R$, called the 
control language. 

A string $\sigma \in ((N \cup \Sigma) \times R^*)^*$ is called a sentential form. A sentential form $\sigma$ is 
said to derive a sentential form $\tau$, written $\sigma \Rightarrow \tau$, if 

    $\sigma = \gamma (A, \omega) \delta$ 
    $r = A \to \beta_1 X! \beta_2 \in R$ 
    $\tau = \gamma \beta_1' (X, \omega r) \beta_2' \delta$ 

where $A \in N$, $X \in N \cup \Sigma$, $\beta_1, \beta_2 \in (N \cup \Sigma)^*$, $\gamma, \delta \in ((N \cup \Sigma) \times R^*)^*$, $\omega \in R^*$, and 
$\beta_1'$ and $\beta_2'$ are obtained from $\beta_1$ and $\beta_2$ resp. by replacing every symbol $Y \in N \cup \Sigma$ 
by $(Y, \varepsilon)$. The reflexive and transitive closure of $\Rightarrow$, denoted by $\Rightarrow^*$, is defined as 
usual. The language generated by $K$ is defined as $L(K) = \{a_1 a_2 \dots a_n \mid (A_{in}, \varepsilon) \Rightarrow^* 
(a_1, \omega_1)(a_2, \omega_2) \dots (a_n, \omega_n)$ and $a_i \in \Sigma$, $\omega_i \in H$ for $i = 1, \dots, n\}$. Let $\mathfrak{G}_{LD}$ be 
the class of all LDGs. For any class of grammars $\mathfrak{G}$ for which control is defined 
let $\mathfrak{G}/\mathcal{L} = \{(G, H) \mid G \in \mathfrak{G}$ and $H \in \mathcal{L}\}$ and for any class of grammars $\mathfrak{G}$ 
let $L(\mathfrak{G}) = \{L(K) \mid K \in \mathfrak{G}\}$. Regarding each element of $(N \cup \Sigma) \times R^*$ as 
an object, the definitions of spines and distinguished descendents carry over 
straightforwardly to LCGs. 

^ In fact only $m(\pi)$ for $\pi \in P$ has been defined so far. It is, however, straightforward 
to extend the domain of $m$ to $BE(P)$. 






C. Wartena 



We can think of a CFL-$S$-grammar as an LDG controlled by (the language 
accepted by) an $S$-automaton but in which the control word for each spine is 
computed on the fly. On the other hand, it is easy to imagine how a CFL-$S$-G 
can be split up into an LDG and an $S$-automaton generating the correct set of 
control words. The relation between $S$-grammars, $S$-automata and controlled 
grammars was established for linear grammars (i.e. grammars with only one 
nonterminal in the right-hand side of each rule) by [?]. The more general case 
of CFL-$S$-Gs and LDGs is captured in the following theorem: 

Theorem 1. $L(\mathfrak{G}_{LD}/\mathcal{L}_Q(S)) = \mathcal{L}_{CFL}(S)$ 

Proof. "$\subseteq$": Let $K = (G, L(M))$ be an LCG for some LDG $G = (N, \Sigma, R, A_{in})$ 
and some $S$-automaton $M = (Q, R, \delta, q_0, c_0, Q_F)$ with $S = (C, C_0, C_F, P, F, m)$ 
an arbitrary storage type. Construct a CFL-$S$-G $G' = (N', \Sigma, R', A_{in}', c_0)$ as 
follows. 

    $N' = \{(X, q) \mid X \in N \cup \Sigma$ and $q \in Q\}$ 
    $A_{in}' = (A_{in}, q_0)$ 
    $R' = \{(A, q) \to \text{if } \pi \text{ then } \beta_1' (X, p) f \beta_2' \mid$ 
              $r = A \to \beta_1 X! \beta_2 \in R$ and $(q, r, \pi, p, f) \in \delta\}$ 
        $\cup\ \{(X, q) \to \text{if } \pi \text{ then } (X, p) f \mid$ 
              $X \in N \cup \Sigma$ and $(q, \varepsilon, \pi, p, f) \in \delta\}$ 
        $\cup\ \{(a, q) \to a \mid a \in \Sigma$ and $q \in Q_F\}$ 

where $\beta_1'$ and $\beta_2'$ are obtained from $\beta_1$ and $\beta_2$ respectively by replacing every 
symbol $Y \in N \cup \Sigma$ by $(Y, q_0)$. 

It is easy to verify that the configurations of the storage of $M$ during a com- 
putation correspond to the associated storage configurations of the nonterminals 
on one spine (in $G'$), and that the word accepted by $M$ is the string of pro- 
duction rules (from $G$) that are used to expand the nonterminals on the spine. 
Consequently, this sequence of rules has to be a word in $L(M)$. Now it can be 
shown by induction that 

    $(A, \varepsilon) \Rightarrow^* v' (B, \omega) w'$ and $(q_1, \omega, c_1) \vdash_M^* (q_2, \varepsilon, c_2)$ 
    iff 
    $((A, q_1), c_1) \Rightarrow^* v ((B, q_2), c_2) w$ 

where $(B, \omega)$ and $((B, q_2), c_2)$ are distinguished children of $(A, \varepsilon)$ and $((A, q_1), c_1)$ 
resp. and with $A, B \in N$, $q_1, q_2 \in Q$, $c_1, c_2 \in C$, $v, w \in \Sigma^*$, $\omega \in R^*$ and 
$v', w' \in \Sigma \times R^*$ the respective corresponding counterparts of $v$ and $w$. 

"$\supseteq$": For some $S = (C, C_0, C_F, P, F, m)$ let $G = (N, \Sigma, R, A_{in}, c_0)$ be a CFL- 
$S$-G. Construct an LDG $G' = (N, \Sigma, R', A_{in})$, where $A \to \beta_1 B! \beta_2 \in R'$ if 
$A \to \text{if } \pi \text{ then } \beta_1 B f \beta_2 \in R$ and $A \to b! w \in R'$ if $A \to \text{if } \pi \text{ then } bw \in R$ 
with $A, B \in N$, $f \in F$, $\beta_1, \beta_2 \in (N \cup \Sigma)^*$, $b \in \Sigma$ and $w \in \Sigma^*$. Further 
construct an $S$-automaton $M = (N \cup \{q_F\}, R', \delta, A_{in}, c_0, \{q_F\})$, by setting 

    $\delta = \{(A, r, \pi, B, f) \mid r = (A \to \text{if } \pi \text{ then } \beta_1 B f \beta_2) \in R\}$ 
        $\cup\ \{(A, r, \pi, q_F, \mathrm{id}) \mid r = (A \to \text{if } \pi \text{ then } w) \in R\}$ 

where id denotes identity, the operation that does not change the configuration 
of the storage, assuming w.l.o.g. that the storage type $S$ has an identity. It can 
be shown by induction that 

    $(A, c_1) \Rightarrow^* v (B, c_2) w$ 
    iff 
    $(A, \varepsilon) \Rightarrow^* v' (B, \omega) w'$ and $(A, \omega, c_1) \vdash_M^* (B, \varepsilon, c_2)$ 

where $(B, c_2)$ and $(B, \omega)$ are distinguished children of $(A, c_1)$ and $(A, \varepsilon)$ resp. and 
with $A, B \in N$, $q_1, q_2 \in Q$, $c_1, c_2 \in C$, $v, w \in \Sigma^*$, $\omega \in R^*$ and $v', w' \in \Sigma \times R^*$ 
the respective corresponding counterparts of $v$ and $w$. □ 

Fig. 1. Example of a possible configuration of a $(S_{pd} \circ_r S_{pd}) \circ_r S_{pd}$ storage with marking 
of possibilities for reading and writing. 



3 Sequences of Pushdowns 

As noted above, grammars or automata using two pushdowns seem useful for the 
description of natural languages but are too powerful. Automata with a kind of 
storage that is in some respects similar to a sequence of pushdowns but less pow- 
erful are defined in [?]. These automata are called multi-pushdown automata. In 
fact the storage a multi-pushdown automaton uses can be considered as a tuple 
of pushdowns if writing is concerned, but with respect to reading the storage 
behaves like a single pushdown. A possible configuration of such a storage type 
is shown in Figure 1. In order to give a simple definition of this kind of storage 
types independently of the automata using them, I will define a concatenation 
operator on storage types. Applied to pushdowns it yields the storage types 
actually used in [?]. 

Definition 5. For any storage type $S^1 = (C^1, C^1_0, C^1_F, P^1, F^1, m^1)$ and $S^2 = 
(C^2, C^2_0, C^2_F, P^2, F^2, m^2)$ with $F^2 = F^2_r \cup F^2_w$, where $F^2_r$ denotes the set of read- 
ing instructions and $F^2_w$ the set of writing instructions, the concatenation w.r.t. 
reading of $S^1$ and $S^2$ is defined as $S^1 \circ_r S^2 = (C, C_0, C_F, P, F, m)$ where 

    $C = C^1 \times C^2$      $C_0 = C^1_0 \times C^2_0$      $P = P^1 \cup \{\mathrm{test}(p) \mid p \in P^2\}$ 
    $C_F = C^1_F \times C^2_F$      $F = F^1 \cup \{\mathrm{do}(f) \mid f \in F^2\}$ 

and for every $c^1 \in C^1$, $c^2 \in C^2$ 

    $m(p)((c^1, c^2)) = m^1(p)(c^1)$ 
    $m(f)((c^1, c^2)) = (m^1(f)(c^1), c^2)$ 
    $m(\mathrm{test}(p))((c^1, c^2)) = m^2(p)(c^2)$ 
    $m(\mathrm{do}(f))((c^1, c^2)) = (c^1, m^2(f)(c^2))$ if $f \in F^2_w$ 
    $m(\mathrm{do}(f))((c^1, c^2)) = (c^1, m^2(f)(c^2))$ if $f \in F^2_r$ and $c^1 \in C^1_F$ 

For the pushdown type let the instruction push be in $F_w$ and all other instruc- 
tions in $F_r$. Note that $m(\mathrm{do}(f))((c^1, c^2))$ is undefined for $f \in F^2_r$ if $c^1 \notin C^1_F$. If $S^1$ 
is a pushdown this means that reading and popping from the second component 
is only possible if $c^1$ is the empty string. Thus a concatenation of 
pushdowns defined in this way corresponds to the storage type used in [?]. An 
important property of concatenated storage types that is easy to verify is that 
$S^1 \circ_r S^2$ is well-formed and concatenating if $S^1$ and $S^2$ are. 
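The effect of the restriction can be seen in a minimal sketch of $S_{pd} \circ_r S_{pd}$. The function names `first` and `do_reading` are illustrative only, only `push` (a writing instruction) and `pop` (a reading instruction) are modelled, and `None` stands for an undefined result. The sketch relies on the fact that the only final configuration of a pushdown is the empty one.

```python
from typing import Optional, Tuple

Stack = Tuple[str, ...]            # index 0 is the top of the pushdown
Config = Tuple[Stack, Stack]       # (first pushdown, second pushdown)


def first(config: Config, instr: str, symbol: str = "") -> Optional[Config]:
    """m(f) for an instruction f of the first component: never restricted."""
    c1, c2 = config
    if instr == "push":
        return ((symbol,) + c1, c2)
    if instr == "pop":
        return (c1[1:], c2) if c1 else None
    raise ValueError(instr)


def do_reading(config: Config, instr: str, symbol: str = "") -> Optional[Config]:
    """m(do(f)) under concatenation w.r.t. reading; f acts on the second component."""
    c1, c2 = config
    if instr == "push":                      # writing instruction: always defined
        return (c1, (symbol,) + c2)
    if instr == "pop":                       # reading instruction
        if c1 != () or not c2:               # the first component must be final (empty)
            return None
        return (c1, c2[1:])
    raise ValueError(instr)


cfg = ((), ())
cfg = first(cfg, "push", "a")                # (("a",), ())
cfg = do_reading(cfg, "push", "b")           # writing to the second pushdown is free
blocked = do_reading(cfg, "pop")             # first pushdown not empty: undefined
cfg = first(cfg, "pop")                      # empty the first pushdown
freed = do_reading(cfg, "pop")               # now the second pushdown can be read
```

Writing to the second pushdown succeeds at any time, while popping from it is blocked until the first pushdown has been emptied.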

The languages that are accepted by an $S \circ_r S_{pd}$-automaton can be charac- 
terized by linear control of an LDG restricted by a condition defined for LIGs 
in [?], namely that the distinguished symbol in a rule is always the leftmost 
nonterminal daughter. 



Definition 6. Let $G = (N, \Sigma, R, A_{in})$ be an LDG. $G$ is called an extended left 
LDG (ELLDG) if each production is of one of the following two forms: $A \to aB!C$ 
or $A \to a!$, where $A, B \in N$, $C \in N_\varepsilon$ and $a \in \Sigma_\varepsilon$. If each production is of the 
form $A \to BC!a$ or $A \to a!$, $G$ is called an extended right LDG (ERLDG). The 
class of ELLDGs (ERLDGs) is denoted by $\mathfrak{G}_{ELLD}$ ($\mathfrak{G}_{ERLD}$). 

The same restrictions can be applied to CFL-$S$-Gs. We say that a CFL-$S$- 
G $G = (N, \Sigma, R, A_{in}, c_0)$ is an extended left linear $S$-grammar (ELL-$S$-G) if 
each production is of one of the following two forms: $A \to aBfC$ or $A \to a$, 
where $A, B \in N$, $C \in N_\varepsilon$, $a \in \Sigma_\varepsilon$ and $f$ an instruction symbol from $S$. If each 
production is of the form $A \to BCfa$ or $A \to a$, $G$ is called an extended right 
linear $S$-grammar (ERL-$S$-G). By applying the construction from the proof of 
Theorem 1 to the grammars with the restricted rule format it is easy to verify 
that the classes of languages generated by ELL-$S$-Gs (ERL-$S$-Gs) and ELLDGs 
(ERLDGs) controlled by $S$-automata are equivalent. 

Now I will show that the class of languages generated by ELLDGs controlled 
by languages accepted by an $S$-automaton is identical to the class of languages 
accepted by ($S \circ_r S_{pd}$)-automata. Since controlled ELLDGs and ELL-$S$-Gs are 
equivalent, we can use the latter just as well for the proof, which will be a bit shorter 
in this case. The idea for the simulation of the controlled grammar is to use 
the second component of the storage of the automaton to store the grammar 
symbols simulating a leftmost derivation. The first component can correspond 
to the storage of the grammar. Each time a symbol of the control word, i.e. a 
rule of the grammar, is produced, this rule is carried out: the initial terminal 
is read from the input, the first nonterminal is coded in a new state and all 
other symbols are pushed to the second component of the store. As long as 
the expansion of distinguished symbols is controlled, reading from the second 
component is not possible and not necessary since the controlled path leads from 
leftmost nonterminal to leftmost nonterminal daughter. Since each control word 
has to be computed on the first component of the storage, the simulation is only 
possible if the configuration of that component after recognizing a control word 
is again the initial configuration. That is to say that $S$ has to be concatenating. 



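Lemma 1. For each concatenating storage type $S$ the following holds: 

    $L(\mathfrak{G}_{ELLD}/\mathcal{L}(S)) \subseteq \mathcal{L}(S \circ_r S_{pd})$ 

Proof. Let $G = (N, \Sigma, R, A_{in}, c_0)$ be an ELL-$S$-G for some concatenating storage 
$S = (C, C_0, C_F, P, F, m)$. Construct the required $S \circ_r S_{pd}$-automaton $M = (N_\varepsilon \times 
N_\varepsilon, \Sigma, \delta, A_{in}, (c_0, \varepsilon), \emptyset)$ by setting 

    $\delta = \{((A, \varepsilon), a, \pi, (B_1, B_2), f) \mid$                              (1) 
              $A \to \text{if } \pi \text{ then } a B_1 f B_2 \in R\}$ 
        $\cup\ \{((A, B), a, \pi, (A, \varepsilon), \mathrm{do}(\mathrm{push}(B))) \mid A, B \in N\}$        (2) 
        $\cup\ \{((A, \varepsilon), a, \pi, (\varepsilon, \varepsilon), \mathrm{id}) \mid$                                 (3) 
              $A \to \text{if } \pi \text{ then } a \in R\}$ 
        $\cup\ \{((\varepsilon, \varepsilon), \varepsilon, \mathrm{test}(\mathrm{top} = A), (A, \varepsilon), \mathrm{do}(\mathrm{pop})) \mid A \in N\}$  (4) 

It can be shown by induction that 

    $((A, \varepsilon), x, (\sigma, \varepsilon)) \vdash_M^* ((\varepsilon, \varepsilon), \varepsilon, (c_F, \varepsilon))$ iff $(A, \sigma) \Rightarrow_G^* x$ 

In case $A = A_{in}$ and $\sigma = c_0$ we see that $M$ accepts exactly the language that 
is generated by $G$. □ 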

The idea behind the construction of an ELL-$S$-G simulating an $S \circ_r S_{pd}$- 
automaton is again straightforward. The underlying CFG is used to code the 
configurations of the pushdown and the additional storage of type $S$ makes 
the same transitions as the corresponding component of the automaton. The 
computation of a control word and hence the simulation of the first component 
of the $S \circ_r S_{pd}$ storage has to stop as soon as the foot of a spine in the ELL-$S$- 
G is reached. This, however, is unproblematic since this point corresponds to a 
situation in which the automaton will pop an element from the second component 
and the first component has to be in a final configuration, so to say accepting the 
control word. The subsequent configurations of the first component are simulated 
along a new spine which, according to the definition of LCGs, starts with $c_0$. 
Thus, the final configuration that was reached by the first component has to be 
$c_0$, which is the case if $S$ is concatenating. 



Lemma 2. For each concatenating storage type $S$ the following holds: 

    $\mathcal{L}(S \circ_r S_{pd}) \subseteq L(\mathfrak{G}_{ELLD}/\mathcal{L}(S))$ 

Proof. Let $M = (Q, \Sigma, \delta, q_0, c_0, \emptyset)$ be an $S \circ_r S_{pd}$-automaton for some con- 
catenating storage type $S = (C, C_0, C_F, P, F, m)$. Assume w.l.o.g. that the 
only boolean expression applied to the pushdown is $\mathrm{top} = \gamma$ and that this 
predicate is used only in combination with popping. Construct an ELL-$S$-G 
$G = ((Q \times \Gamma_\varepsilon \times Q) \cup \{A_{in}\}, \Sigma, R, A_{in}, c_0)$ with $\Gamma$ the stack alphabet of the pushdown 
component $S_{pd}$, $A_{in}$ a new symbol and 

    $R = \{A_{in} \to (q_0, \varepsilon, q) \mid q \in Q\}$                                            (1) 
        $\cup\ \{(q_1, A, q_3) \to \text{if } \pi \text{ then } a (q_2, A, q_3)(f) \mid$                        (2) 
              $(q_1, a, \pi, q_2, f) \in \delta$, 
              $\pi \in BE(P)$, $f \in F \cup \{\mathrm{do}(\mathrm{id})\}$, $A \in \Gamma_\varepsilon$ and $q_3 \in Q\}$ 
        $\cup\ \{(q_1, A, q_4) \to \text{if } \pi \text{ then } a (q_2, B, q_3)(\mathrm{id})(q_3, A, q_4) \mid$           (3) 
              $(q_1, a, \pi, q_2, \mathrm{do}(\mathrm{push}(B))) \in \delta$, 
              $A \in \Gamma_\varepsilon$ and $q_3, q_4 \in Q\}$ 
        $\cup\ \{(q_1, A, q_2) \to \text{if empty then } a \mid$                                   (4) 
              $(q_1, a, \mathrm{test}(\mathrm{top} = A), q_2, \mathrm{do}(\mathrm{pop})) \in \delta$, 
              $A \in \Gamma_\varepsilon$ and $q_1 \in Q\}$ 

To establish that $L(G) = L(M)$ it can be shown by induction on the number of 
steps in the derivation of $G$ that 

    $((p, A, q), \sigma) \Rightarrow_G^* x$ iff $(p, x, (\sigma, A)) \vdash_M^* (q, \varepsilon, (c_F, \varepsilon))$ □ 

Theorem 2. For each concatenating storage type $S$ the following holds. 

    $\mathcal{L}(S \circ_r S_{pd}) = L(\mathfrak{G}_{ELLD}/\mathcal{L}(S))$ □ 

The language classes of the hierarchy established in [?] can in our notation 
be defined for each $i \in \mathbb{N}$ as $\mathcal{L}(S_i)$ where $S_0 = S_{triv}$ and $S_i = S_{i-1} \circ_r S_{pd}$. 
According to Theorem 2 this hierarchy can be defined as well by $\mathcal{L}(S_0) = \mathcal{L}_{REG}$ 
and $\mathcal{L}(S_i) = L(\mathfrak{G}_{ELLD}/\mathcal{L}(S_{i-1}))$, where $\mathcal{L}_{REG}$ denotes the class of regular languages. 
Given this representation, it is possible to compare this hierarchy to Weir’s ([?]) 
hierarchy of mildly context sensitive languages that is defined as $\mathcal{L}_0 = \mathcal{L}_{REG}$ and 
$\mathcal{L}_i = L(\mathfrak{G}_{LD}/\mathcal{L}_{i-1})$. Since each ELLDG is an LDG, we can conclude that each 
class of the hierarchy of concatenated pushdowns (cf. Table [?]) is mildly context 
sensitive too. Finally, it follows immediately that the language generated by some 
CFL-$S$-G, with $S$ a concatenation of pushdowns, is mildly context sensitive. This 
is the result which in fact allows us to use these grammars for linguistic purposes. 
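The inclusion argument can be spelled out as a short induction. In the sketch below, $\mathcal{W}_i$ denotes the $i$th class of Weir's hierarchy; this name is introduced here only to keep it apart from the classes $\mathcal{L}(S_i)$ of concatenated pushdowns. The sketch assumes that enlarging the class of control languages, or the class of controlled grammars, can only enlarge the class of generated languages, and that Theorem 1 may be applied with $\mathcal{L}_Q(S_i) = \mathcal{L}(S_i)$.

```latex
\begin{align*}
\mathcal{L}(S_0) &= \mathcal{L}_{REG} = \mathcal{W}_0 \\
\mathcal{L}(S_i) &= L(\mathfrak{G}_{ELLD}/\mathcal{L}(S_{i-1}))
    && \text{(Theorem 2)} \\
  &\subseteq L(\mathfrak{G}_{ELLD}/\mathcal{W}_{i-1})
    && \text{(induction hypothesis)} \\
  &\subseteq L(\mathfrak{G}_{LD}/\mathcal{W}_{i-1}) = \mathcal{W}_i
    && \text{(each ELLDG is an LDG)} \\
\mathcal{L}_{CFL}(S_i) &= L(\mathfrak{G}_{LD}/\mathcal{L}(S_i))
    && \text{(Theorem 1)} \\
  &\subseteq L(\mathfrak{G}_{LD}/\mathcal{W}_i) = \mathcal{W}_{i+1}
\end{align*}
```

The last two lines give the statement about grammars: a CFL-$S_i$-G generates a language that lies one level higher in Weir's hierarchy than the control languages accepted by $S_i$-automata.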

The inclusion of the multiple-pushdown languages in the classes from Weir’s 
hierarchy can be intuitively understood in another way, considering the storage 
types defined by [?]. Take the automata accepting languages of the second class 
in Weir’s hierarchy as an example. These automata use (linear) pushdowns of 
pushdowns. The allowed operations include not only replacement of the topmost 
symbol of the topmost pushdown by a sequence of symbols, but also replacing the 
entire topmost pushdown by a sequence of pushdowns including the original one. 
This means in fact that we can either write on the top (symbols or pushdowns) 
or below the topmost pushdown (only pushdowns). Thus, in some sense these 
automata too incorporate the idea of having several possibilities for writing but 
only one for reading. 

^ Note that Theorem 2 holds for $S_{triv}$ though it is not well-formed. 



4 Linguistic Relevance 

In the previous section I have sketched a possible way of defining grammars with 
more than one pushdown for storing information that is needed to compute non- 
local dependencies. Now we might ask how well this model fits empirical data 
and the linguistic theories that I have adduced as a motivation for the use of 
multiple pushdowns. 

In Rizzi’s ([?]) theory of relativized minimality the distinction of different 
types of nonlocal dependencies plays an important role in the formulation of 
constraints like minimality. Minimality requires in principle that each trace is 
antecedent-governed by the closest (c-commanding) antecedent available. Rizzi 
distinguishes three types of antecedent-government: A-government (from ar- 
gument positions), A’-government (from non-argument positions) and head- 
government. Minimality must only be respected among relations of the same 
type, hence relativized minimality. This means that no potential antecedent of a 
certain type can fall properly between a trace and its governor of the same type. 
On the one hand, non-local dependencies are in this way divided into three 
categories in the way we assumed. On the other hand, we do not need three 
pushdowns to account for these. In fact, the relativized minimality requirement 
entails that there can be at most three overlapping paths, and a finite storage 
would suffice for formalization in terms of $S$-grammars (cf. [?]). There is, how- 
ever, a possibility to circumvent minimality. Some A’-traces do not have to be 
antecedent-governed, but only have to be bound by their respective antecedents. 
For binding, however, no minimality condition is formulated. 

Minimalist theories like those sketched in [?] capture (relativized) minimal- 
ity effects with the minimal link or shortest movement condition. It is assumed 
that movement is always triggered by the need to check a feature. Basically, 
features can be divided into licensing features (licensors) and features that have 
to be licensed (licensees). The verb (or a functional projection of the verb), e.g., 
has features that license the case features on the arguments. If some licensee 
is not in a local relation to an appropriate licensor, the latter can attract the 
phrase bearing the feature that has to be licensed. A licensor can, however, only 
attract the closest appropriate licensee available. The different types of move- 
ment are distinguished by the features that are checked in the target position, 
as far as A- and A’-movement are concerned. Movement of heads is not fur- 
ther split up in this way. The minimalist program sketches several possibilities 
of relaxing the minimality condition. The first way comes about along with the 
creation of complex categories by means of Chomsky-adjunction. The typical 
example is successive cyclic head movement: a head moves to the next head and 
adjoins to it. Further movement of the complex head is regarded as one move- 
ment step though two new nonlocal dependencies are created. This process can 
be carried on recursively. Thus it is possible to create an unbounded number of 
nonlocal dependencies. Since successive cyclic head movement always creates nested 
dependencies, a formalization using a pushdown seems to be fine. The second 
possibility indicated by Chomsky to create an unbounded number of nonlocal 
dependencies arises if we allow a head to have multiple specifiers and an at- 
tracting feature to escape erasure after checking. After raising a phrase to the 
specifier of the attracting head, the next more deeply embedded phrase that can 
check the same feature becomes the closest candidate for moving. Due to the 
extension principle it has to be placed in the next higher specifier. Consequently 
the dependencies created in this way are nested. Since this process can in prin- 
ciple take place independently with regard to different features, our approach 
using a number of pushdowns fits very well. Finally, phrases can, under certain 
conditions, become equidistant. If two phrases are equidistant either can be at- 
tracted. Now movement can be crossing as well. Chomsky, however, argues that 
at most two phrases should be allowed to become equidistant. Thus, this is not 
a possibility to create an unbounded number of nonlocal dependencies. 

In some psycholinguistic theories about sentence processing it is argued that 
the human parser distinguishes between different types of relations and that the 
cognitive difficulty of a sentence is not determined by the number of relations 
that have to be computed but rather by the number of relations of the same type 
([?]). This leads Lewis to propose a model for sentence processing that in fact 
uses different stacks for different types of relations. For the human parser stacks 
might have an upper bound on the number of elements they can contain, typically 
two or three. As soon as more elements of one type have to be remembered 
the sentence will become too difficult to parse, but not ungrammatical. Thus 
Rado [?] explicitly argues that we need two pushdowns to compute nonlocal 
dependencies in Hungarian. One pushdown is used for topicalized phrases, the 
other for wh-phrases. 

Finally, we have to ask what price must be paid for the restriction on reading 
from the pushdowns that we had to impose on composite storage types in order 
to domesticate them. Assume that indices are pushed onto the storage when 
a moved element is encountered, while they are popped in the respective base 
positions. Suppose further that one pushdown is used for each type of movement. 
Now the restriction under consideration requires that all indices of one type 
must be read before the base positions of another type are found, such that the 
base positions have to be ordered while landing sites might be mixed. In other 
words we have to be able to locate a region from which all arguments originate, 
a region in which all wh-phrases are generated etc. In natural languages, on 
the contrary, we find a clustering of the landing sites for each type. This can 
e.g. be observed in languages with multiple wh-movement. The strict adjacency 
between the fronted wh-phrases in these languages even leads [?] to analyze 
the wh-phrases as a single constituent formed by adjunction. Similarly we find 
in Dutch constructions in which an arbitrary number of verbs can raise a strict 
adjacency between these verbs. The clustering of the landing sites chimes in with 
the minimalist program as well. As I have argued above, the only two possibilities 
for moving an unbounded number of phrases are constituted by successive cyclic 
movement and by movement into multiple specifiers of one head. In both cases 
the landing sites clearly have to be adjacent. 

^ To account for crossing dependencies that are possible to some degree she assumes 
that the topmost two elements of the wh-stack are visible. 

In the following I will show that the restriction on reading in a concatenation 
of pushdowns can be replaced by a condition on writing. I will show that we 
obtain similar results with regard to the generative capacity in this case. 



5 A Linguistically Inspired Variant 

In the previous section I have argued that it is more appropriate to leave pop- 
ping that is needed in the base positions unrestrained and restrict the writing 
operations that have to be carried through while the displaced constituents are 
encountered, rather than the other way around. This results in the following 
alternative definition for the concatenation of storage types. 

Definition 7. For any storage type $S^1 = (C^1, C^1_0, C^1_F, P^1, F^1, m^1)$ and $S^2 = 
(C^2, C^2_0, C^2_F, P^2, F^2, m^2)$ with $F^2 = F^2_r \cup F^2_w$, where $F^2_r$ and $F^2_w$ denote the sets 
of reading and writing instructions resp., the concatenation w.r.t. writing of $S^1$ 
and $S^2$ is defined as $S^1 \circ_w S^2 = (C, C_0, C_F, P, F, m)$ where $C$, $C_0$, $C_F$, $P$ and $F$ 
are defined as in the concatenation w.r.t. reading and $m$ is defined for every 
$c^1 \in C^1$, $c^2 \in C^2$ as 

    $m(p)((c^1, c^2)) = m^1(p)(c^1)$ 
    $m(f)((c^1, c^2)) = (m^1(f)(c^1), c^2)$ 
    $m(\mathrm{test}(p))((c^1, c^2)) = m^2(p)(c^2)$ 
    $m(\mathrm{do}(f))((c^1, c^2)) = (c^1, m^2(f)(c^2))$ if $f \in F^2_w$ and $c^1 \in C^1_F$ 
    $m(\mathrm{do}(f))((c^1, c^2)) = (c^1, m^2(f)(c^2))$ if $f \in F^2_r$ 

Concatenation w.r.t. writing of a storage type $S$ and a pushdown involves control 
of extended right LDGs by the reversal of $\mathcal{L}(S)$. This is actually the reason 
why we need controlled grammars in addition to $S$-grammars. Control by the 
reversal of a language cannot be expressed in terms of $S$-grammars, unless the 
control language is closed under reversal. This will, however, not be the case for 
the languages I will consider below. 
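A minimal sketch of $S_{pd} \circ_w S_{pd}$ shows how the roles of reading and writing are swapped relative to concatenation w.r.t. reading. The function names `first` and `do_writing` are illustrative only, only `push` (writing) and `pop` (reading) on the second component are modelled, and `None` again stands for an undefined result.

```python
from typing import Optional, Tuple

Stack = Tuple[str, ...]            # index 0 is the top of the pushdown
Config = Tuple[Stack, Stack]       # (first pushdown, second pushdown)


def first(config: Config, instr: str, symbol: str = "") -> Optional[Config]:
    """m(f) for an instruction f of the first component: never restricted."""
    c1, c2 = config
    if instr == "push":
        return ((symbol,) + c1, c2)
    if instr == "pop":
        return (c1[1:], c2) if c1 else None
    raise ValueError(instr)


def do_writing(config: Config, instr: str, symbol: str = "") -> Optional[Config]:
    """m(do(f)) under concatenation w.r.t. writing; f acts on the second component."""
    c1, c2 = config
    if instr == "push":                      # writing instruction
        if c1 != ():                         # the first component must be final (empty)
            return None
        return (c1, (symbol,) + c2)
    if instr == "pop":                       # reading instruction: always allowed
        return (c1, c2[1:]) if c2 else None
    raise ValueError(instr)


cfg = ((), ())
cfg = do_writing(cfg, "push", "b")           # allowed: the first pushdown is empty
cfg = first(cfg, "push", "a")                # (("a",), ("b",))
blocked = do_writing(cfg, "push", "c")       # first pushdown not empty: undefined
popped = do_writing(cfg, "pop")              # reading the second pushdown is free
```

In the linguistic reading suggested in the text, indices of the second type can be stored only while no index of the first type is pending, but they can be retrieved at any base position.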

The idea for the simulation of a linear controlled ERLDG by an ($S \circ_w S_{pd}$)- 
automaton is to use the second component both for simulating a derivation and 
for storing the rules that are used. When a nonterminal is expanded the label 
and the right hand side of the rule are pushed onto the second component. 
As soon as the automaton starts computing the control word with the first 
component, the second component can only be used for reading. The terminals 
stored there can be read from the input string and the rules now serve as input 
for the computation on the first storage. Thus it is clear that the word of rules 
recognized is the reversal of the sequence of applied rules. Nonterminals on the 
second component cannot be expanded until the first component is in a final 
configuration again, and, consequently, it has accepted the control word. Thus, 
a path can only be controlled if it keeps leading from a rightmost nonterminal 
to its mother. 

^ For a string $w = a_1 a_2 \dots a_{n-1} a_n$ the reversal of $w$ is defined as $w^R = a_n a_{n-1} \dots a_2 a_1$. 
For a language $L$, $L^R = \{w^R \mid w \in L\}$ and for a class of languages $\mathcal{L}$, $\mathcal{L}^R = \{L^R \mid 
L \in \mathcal{L}\}$. 



Lemma 3. Let S = {C,Cq, P, F,m) be a concatenating storage type. Then the 
following holds. 

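Proof Let $S = (C, C_0, P, F, m)$ be a concatenating storage type and let
$K = (G, L(M)^R)$ be an LCG, with $G = (N, \Sigma, R, A_{in})$ an ERLDG and
$M = (Q, R, \delta, q_0, c_0, \emptyset)$ an $S$-automaton. Construct an $(S \circ_w S_{pd})$-automaton
$M' = (Q \times (N \cup \Sigma \cup R \cup \{\varepsilon\}), \Sigma, \delta', (q_0, A_{in}), (c_0, \varepsilon), \emptyset)$ such that $L(M') = L(K)$ by
setting

\[
\begin{aligned}
\delta' = {} & \{((q, A), \varepsilon, \mathrm{true}, (q_0, \varepsilon), \mathrm{do}(\mathrm{push}(\beta r))) \mid r = A \to \beta \in R\} && (1)\\
\cup {} & \{((q, a), a, \mathrm{true}, (q, \varepsilon), \mathrm{id}) \mid a \in \Sigma\} && (2)\\
\cup {} & \{((q_1, r), \varepsilon, \pi, (q_2, \varepsilon), f) \mid (q_1, r, \pi, q_2, f) \in \delta\} && (3a)\\
\cup {} & \{((q_1, r), \varepsilon, \pi, (q_2, r), f) \mid (q_1, \varepsilon, \pi, q_2, f) \in \delta\} && (3b)\\
\cup {} & \{((q, \varepsilon), \varepsilon, \mathrm{test}(\mathrm{top} = X), (q, X), \mathrm{do}(\mathrm{pop})) \mid X \in N \cup \Sigma \cup R,\ q \in Q\} && (4)
\end{aligned}
\]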
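The transition sets (1) to (4) can be enumerated mechanically for a small example. The following Python sketch is only an illustration under stated assumptions: transitions are encoded as plain tuples, the empty word is the empty string, the state component `q` in (1) and (2) is taken to range over all of `Q` (the source leaves it unquantified), and the function name and encodings (`build_delta_prime`, `("push", ...)`, `("top", X)`) are hypothetical.

```python
EPS = ""  # stands for the empty word / empty second state component


def build_delta_prime(rules, sigma, nonterminals, delta, states, q0):
    """Enumerate the transitions of M' following the shapes (1)-(4).

    rules:        dict mapping a rule label r to (A, beta) for r = A -> beta,
                  with beta a tuple of grammar symbols
    sigma:        set of terminal symbols
    nonterminals: set of nonterminal symbols
    delta:        transitions of the control automaton M, as tuples
                  (q1, r_or_EPS, predicate, q2, instruction)
    states:       state set Q of M
    q0:           initial state of M
    Symbols in rules, sigma and nonterminals are assumed pairwise distinct.
    """
    dp = set()
    # (1) expand a nonterminal: push right-hand side followed by the rule label
    for q in states:
        for r, (lhs, beta) in rules.items():
            dp.add(((q, lhs), EPS, "true", (q0, EPS), ("push", tuple(beta) + (r,))))
    # (2) read a terminal held in the state from the input
    for q in states:
        for a in sigma:
            dp.add(((q, a), a, "true", (q, EPS), ("id",)))
    # (3a) M consumes the rule label held in the state; (3b) M makes an epsilon move
    for (q1, x, pi, q2, f) in delta:
        if x != EPS:
            dp.add(((q1, x), EPS, pi, (q2, EPS), f))
        else:
            for r in rules:
                dp.add(((q1, r), EPS, pi, (q2, r), f))
    # (4) pop the top symbol of the pushdown into the state
    for q in states:
        for x in set(nonterminals) | set(sigma) | set(rules):
            dp.add(((q, EPS), EPS, ("top", x), (q, x), ("pop",)))
    return dp
```

With two rules, two terminals, two nonterminals, two states and two transitions in `delta` (one reading a rule label, one epsilon move), the sketch produces 4 + 4 + 1 + 2 + 12 = 23 transitions.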

It can be shown by induction that

\[ ((q_0, A), a_1 \ldots a_j, (c_0, \varepsilon)) \vdash_{M'}^{*} ((q_0, B), \varepsilon, (c_0, b_k r_k b_{k-1} r_{k-1} \ldots b_1 r_1)) \]

and

\[ ((q_1, \varepsilon), b_k b_{k-1} \ldots b_1, (c_1, b_k r_k b_{k-1} r_{k-1} \ldots b_1 r_1)) \vdash_{M'}^{*} ((q_2, \varepsilon), \varepsilon, (c_2, \varepsilon)) \]

iff

\[ (A, \varepsilon) \Rightarrow_G^{*} (a_1, \omega_1) \ldots (a_j, \omega_j)(B, r_1 r_2 \ldots r_k)(b_k, \varepsilon) \ldots (b_1, \varepsilon) \]

and

\[ (q_1, r_k r_{k-1} \ldots r_1, c_1) \vdash_M^{*} (q_2, \varepsilon, c_2) \]

and

\[ \omega_i \in L(M)^R \text{ for } 1 \le i \le j \]

with $A \in N$, $B \in N \cup \Sigma$, $\omega_i \in R^*$, $r_i \in R$ and $b_i \in \Sigma \cup R$ (with $1 \le i \le k$).

If $r_k r_{k-1} \ldots r_1 \in L(M)$ and $B \in \Sigma$, such that $a_1 \ldots a_j B b_k \ldots b_1 \in L(K)$, we
have $q_1 = q_0$, $c_1 = c_0$ and $c_2$ is a final configuration. Consequently, the two parts
of the computation in $M'$ that exist according to the assertion can be connected
using a transition of type (2) and we see that $a_1 \ldots a_j B b_k \ldots b_1 \in L(M')$ as well.
If we have a computation in $M'$, it is easy to see that it can be split up in a way
such that the assertion applies. Now the assertion guarantees that the accepted
string is in $L(K)$ as well.
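Assuming that the assertion is true for some $n \in \mathbb{N}$, for the first direction of the
proof the interesting case is given by the following derivation

\[ (A, \varepsilon) \Rightarrow_G (D, \varepsilon)(B, r)(b, \varepsilon) \Rightarrow_G^{*} (a_1, \omega_1) \ldots (a_j, \omega_j)(B, r)(b, \varepsilon) \]

Assuming that $\omega_i \in L(M)^R$ for $1 \le i \le j$ and $(q_1, r, c_1) \vdash_M (q_2, \varepsilon, c_2)$ we find for
$M'$:

\[
\begin{aligned}
& ((q_0, A), a_1 \ldots a_j, (c_0, \varepsilon)) \vdash_{M'} && \text{(by definition)}\\
& ((q_0, \varepsilon), a_1 \ldots a_j, (c_0, DBbr)) \vdash_{M'} && \text{(by definition)}\\
& ((q_0, D), a_1 \ldots a_j, (c_0, Bbr)) \vdash_{M'}^{*} && \text{(by induction hypothesis)}\\
& ((q_0, \varepsilon), \varepsilon, (c_0, Bbr)) \vdash_{M'} && \text{(by definition)}\\
& ((q_0, B), \varepsilon, (c_0, br))
\end{aligned}
\]

and

\[ ((q_1, \varepsilon), b, (c_1, br)) \vdash_{M'}^{*} ((q_2, \varepsilon), \varepsilon, (c_2, \varepsilon)) \qquad \text{(by definition)} \]

The interesting case for the other direction is a computation starting with a
transition introduced by (1):

\[
\begin{aligned}
& ((q_0, A), a_1 \ldots a_j, (c_0, \varepsilon)) \vdash_{M'}\\
& ((q_0, \varepsilon), a_1 \ldots a_j, (c_0, DBbr)) \vdash_{M'}\\
& ((q_0, D), a_1 \ldots a_j, (c_0, Bbr)) \vdash_{M'}^{*}\\
& ((q_0, \varepsilon), \varepsilon, (c_0, Bbr)) \vdash_{M'}\\
& ((q_0, B), \varepsilon, (c_0, br))
\end{aligned}
\]

For some $q_1, q_2 \in Q$ and $c_1, c_2 \in C$ we can moreover assume that

\[ ((q_1, \varepsilon), b, (c_1, br)) \vdash_{M'}^{*} ((q_2, \varepsilon), \varepsilon, (c_2, \varepsilon)) \]

From this we can conclude

\[
\begin{aligned}
& (A, \varepsilon) \Rightarrow_G\\
& (D, \varepsilon)(B, r)(b, \varepsilon) \Rightarrow_G^{*} && \text{(by definition)}\\
& (a_1, \omega_1) \ldots (a_j, \omega_j)(B, r)(b, \varepsilon) && \text{(by induction)}
\end{aligned}
\]

Furthermore we can infer from the induction hypothesis that $\omega_i \in L(M)^R$
for $1 \le i \le j$ and from the way $M'$ was constructed that $(q_1, r, c_1) \vdash_M (q_2, \varepsilon, c_2)$. □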

Given an $(S \circ_w S_{pd})$-automaton, the idea for the construction of an LCG is that
the nonterminals of the controlled grammar broadly code the symbols of the 
second component and the states at which they were pushed and popped. Thus 
rightmost paths code, from a terminal towards the top, successive states at which 
popping and working on the first component takes place and, consequently, these 
paths have to be controlled. In the rules of the controlled grammar, terminal sym- 
bols that are read while the automaton works on the first storage component 
are added at the end of the right-hand side, behind the distinguished symbol, 
yielding the ERLDG rule format. Finally, we have to account for complete com- 
putations (especially computations beginning and ending with an initial state) 
that are carried through between pushing two symbols to the second component.
These are simulated by rewriting of triples coding the states corresponding to 
that computation and the last terminal symbol that was accepted before. 



Lemma 4. Let $S = (C, C_0, C_F, P, F, m)$ be some concatenating storage type.
Then the following holds.

\[ \mathcal{L}(S \circ_w S_{pd}) \subseteq L(\mathcal{G}_{ERLD}/\mathcal{L}(S)^R) \]

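Proof Let $S = (C, C_0, C_F, P, F, m)$ be a concatenating storage type and
$M = (Q, \Sigma, \delta, q_0, (c_0, \varepsilon), \emptyset)$ an $(S \circ_w S_{pd})$-automaton. Assume without loss of generality
that the only boolean expression applied to the pushdown is $\mathrm{top} = \gamma$ and that
this predicate is used only in combination with popping. Construct an ERLDG
$G = (N, \Sigma, R, A_{in})$ where $A_{in}$ is a new symbol and
$N \subseteq \{A_{in}\} \cup (Q \times \Gamma' \times Q \times \Gamma') \cup (Q \times \Sigma \times Q)$
with $\Gamma$ the stack alphabet of the pushdown and $\Gamma' = \Gamma \cup \{\#\}$, where
$\# \notin \Gamma$. Construct further an $S$-automaton $M' = (Q \cup \{q_0'\}, R, \delta', q_0', c_0, \emptyset)$. $R$
and $\delta'$ are determined by the following:

(1) For each $q \in Q$:

    $r = A_{in} \to (q_0, \ldots) \in R$
    $(q, r, \mathrm{true}, q, \mathrm{id}) \in \delta'$

(2) If $(q_1, a, \pi, q_2, f) \in \delta$ with $f \in F \cup \{\mathrm{do}(\mathrm{id})\}$ then
    for each $q_x \in Q$, $A \in \Gamma$, $B \in \{A, \varepsilon\}$ and $b \in \Sigma$:

    $r_1 = (q_x, A, q_2, B) \to (q_x, A, q_1, B)\, a \in R$
    $r_2 = (q_1, A, q_2, A) \to a \in R$
    $r_3 = (q_x, b, q_2) \to (q_x, b, q_1)\, a \in R$
    $r_4 = (q_1, b, q_2) \to a \in R$
    $(q_1, \{r_1, r_2, r_3, r_4\}, \pi, q_2, f) \in \delta'$
    $(q_0', \varepsilon, \mathrm{true}, q_1, \mathrm{id}) \in \delta'$

(3) If $(q_1, a, \mathrm{true}, q_2, \mathrm{do}(\mathrm{push}(B))) \in \delta$ then
    for each $q_x, q_y, q_z \in Q$ and $A \in \Gamma$:

    $r_1 = (q_x, A, q_z, A) \to (q_x, a, q_2)(q_2, B, q_y, \varepsilon)(q_y, A, q_z, A) \in R$
    $r_2 = (q_x, A, q_z, A) \to (q_x, a, q_2)(q_2, B, q_x, \varepsilon) \in R$
    $r_3 = (q_x, a, q_2) \to (q_x, a, q_1)\, a \in R$
    $r_4 = (q_1, a, q_2) \to a \in R$
    $(q_z, \{r_1, r_2\}, \mathrm{true}, q_x, \mathrm{id}) \in \delta'$
    $(q_1, \{r_3, r_4\}, \mathrm{true}, q_2, \mathrm{id}) \in \delta'$
    $(q_0', \varepsilon, \mathrm{true}, q_1, \mathrm{id}) \in \delta'$

(4) If $(q_1, a, \mathrm{top} = A, q_2, \mathrm{do}(\mathrm{pop})) \in \delta$ then
    for each $q_x \in Q$ and $A \in \Gamma$:

    $r_1 = (q_x, A, q_2, \varepsilon) \to (q_x, A, q_1, A)\, a \in R$
    $r_2 = (q_1, A, q_2, \varepsilon) \to a \in R$
    $(q_1, \{r_1, r_2\}, \mathrm{true}, q_2, \mathrm{id}) \in \delta'$
    $(q_0', \varepsilon, \mathrm{true}, q_1, \mathrm{id}) \in \delta'$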

Footnote: For transitions I will use the notation $(p, \{a_1, \ldots, a_n\}, \pi, q, f)$ as an abbreviation for
$(p, a_1, \pi, q, f), \ldots, (p, a_n, \pi, q, f)$.
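The set-valued transition abbreviation from this footnote can be unfolded mechanically. A minimal Python sketch, assuming transitions are encoded as 5-tuples (the function name `expand_transition` is a hypothetical helper, not part of the paper):

```python
def expand_transition(p, symbols, pi, q, f):
    """Unfold (p, {a1, ..., an}, pi, q, f) into the n single-symbol transitions."""
    return {(p, a, pi, q, f) for a in symbols}
```

For example, `expand_transition("q1", ["r1", "r2"], "true", "q2", "id")` stands for the two transitions that read `r1` and `r2` respectively between the same pair of states.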
Now it can be shown that $L(M) = L(G)$.

"$L(M) \subseteq L(G)$". It can be shown by induction on $n$ that

\[ (q_1, a_1 \ldots a_j b_k \ldots b_1, (c_1, A)) \vdash_M^{n} (q_3, \varepsilon, (c_2, B)) \]

with $B \in \{A, \varepsilon\}$ implies for some $q_2 \in Q$

\[ ((q_1, A, q_3, B), \varepsilon) \Rightarrow_G^{*} (a_1, \omega_1) \ldots (a_j, \omega_j)(b_k, r_1 r_2 \ldots r_k)(b_{k-1}, \varepsilon) \ldots (b_1, \varepsilon) \]

and

\[ (q_0', r_k \ldots r_1, c_1) \vdash_{M'}^{*} (q_3, \varepsilon, c_2) \]

and

\[ \omega_i^R \in L(M') \text{ for } 1 \le i \le j \]

We start the induction with $n = 1$. In this case, if $A = B$ the transition has
involved some $f \in F \cup \{\mathrm{do}(\mathrm{id})\}$. If otherwise $B = \varepsilon$ then $f = \mathrm{do}(\mathrm{pop})$. In both
cases we have $j = 0$ and $k = 1$ since the $a$'s are generated only after pushing some
symbol onto the second storage component. The grammar $G$ has the following
rule, introduced by (2) if $B = A$ and by (4) if $B = \varepsilon$.

\[ r_1 = (q_1, A, q_3, B) \to b_1 \]

For $M'$ we find

\[ (q_0', r_1, c_1) \vdash_{M'} (q_1, r_1, c_1) \vdash_{M'} (q_3, \varepsilon, c_2) \]

Suppose the implication is true for some $n \in \mathbb{N}$. Consider a computation of
length $n + 1$. Suppose the first transition employs some $f \in F \cup \{\mathrm{do}(\mathrm{id})\}$. The
computation has the following shape:

\[
\begin{aligned}
& (q_1, a_0 a_1 \ldots a_j b_k \ldots b_1, (c_1, A)) \vdash_M\\
& (q_2, a_1 \ldots a_j b_k \ldots b_1, (c_2, A)) \vdash_M^{n}\\
& (q_4, \varepsilon, (c_3, B))
\end{aligned}
\]

We distinguish two cases: the number of elements that were pushed onto the
second component during the computation is zero (I.) or not zero (II.).

By the induction hypothesis we know there is a derivation in $G$ corresponding
to the last $n$ transitions. In case I. the last rule used must be of the type $r_2$ introduced
by (2) or (4), and all other applied rules have to be left linear; consequently
the first terminal that is generated is a distinguished child of $((q_1, A, q_3, B), \varepsilon)$.
By definition this is $(b_k, r_1 \ldots r_k)$. If the last rule was introduced by (2), the
derivation is of the following form. If the last rule was introduced by (4) the
situation is similar.

\[
\begin{aligned}
& ((q_2, A, q_4, B), \varepsilon) \Rightarrow_G^{*}\\
& ((q_2, A, q_3, A), r_1 r_2 \ldots r_{k-1})(b_{k-1}, \varepsilon) \ldots (b_1, \varepsilon) \Rightarrow_G\\
& (b_k, r_1 r_2 \ldots r_k)(b_{k-1}, \varepsilon) \ldots (b_1, \varepsilon)
\end{aligned}
\]

and for $M'$ we find $(q_0', r_k \ldots r_1, c_2) \vdash_{M'} (q_3, r_k \ldots r_1, c_2) \vdash_{M'}^{*} (q_4, \varepsilon, c_3)$. For all the
left linear rules rewriting a nonterminal with $q_2$ as its first component, there are
corresponding rules that have $q_1$ as the first component of their nonterminals
and that are identical to the rules used in the above derivation up to this point.
Thus, the grammar also licenses the derivation which we are looking for.

\[
\begin{aligned}
& ((q_1, A, q_4, B), \varepsilon) \Rightarrow_G^{*}\\
& ((q_1, A, q_3, A), r'_1 r'_2 \ldots r'_{k-1})(b_{k-1}, \varepsilon) \ldots (b_1, \varepsilon) \Rightarrow_G\\
& ((q_1, A, q_2, A), r'_1 r'_2 \ldots r'_k)(b_k, \varepsilon) \ldots (b_1, \varepsilon) \Rightarrow_G\\
& (a_0, r'_1 r'_2 \ldots r'_k r'_{k+1})(b_k, \varepsilon) \ldots (b_1, \varepsilon)
\end{aligned}
\]

It is easy to verify that $M'$ can make the following computation

\[
\begin{aligned}
& (q_0', r'_{k+1} r'_k \ldots r'_1, c_1) \vdash_{M'} (q_2, r'_{k+1} r'_k \ldots r'_1, c_1) \vdash_{M'}\\
& (q_3, r'_k \ldots r'_1, c_2) \vdash_{M'}^{*} (q_4, \varepsilon, c_3)
\end{aligned}
\]

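In case II. we again know that there is a derivation in $G$ corresponding to the last
$n$ transitions. Now some element has been pushed onto the second component
of the storage. Thus, the leftmost derivation corresponding to the last $n$ steps
has the following shape.

\[
\begin{aligned}
& ((q_2, A, q_4, B), \varepsilon) \Rightarrow_G^{*}\\
& ((q_2, x, q_3), \omega_1)\beta \Rightarrow_G (a_1, \omega_1 r)\beta \Rightarrow_G^{*}\\
& (a_1, \omega_1 r)(a_2, \omega_2) \ldots (a_j, \omega_j)(b_k, \omega_{j+1})(b_{k-1}, \varepsilon) \ldots (b_1, \varepsilon)
\end{aligned}
\]

where $\beta$ is a sentential form in $G$. By the same argument as in the previous case
we also have the following derivation:

\[
\begin{aligned}
& ((q_1, A, q_4, B), \varepsilon) \Rightarrow_G^{*} ((q_1, x, q_3), \omega_0)\beta \Rightarrow_G\\
& ((q_1, x, q_2), \omega_0 r_1)(a_1, \varepsilon)\beta \Rightarrow_G (a_0, \omega_0 r_1 r_2)(a_1, \varepsilon)\beta \Rightarrow_G^{*}\\
& (a_0, \omega_0 r_1 r_2)(a_1, \varepsilon)(a_2, \omega_2) \ldots (a_j, \omega_j)(b_k, \omega_{j+1})(b_{k-1}, \varepsilon) \ldots (b_1, \varepsilon)
\end{aligned}
\]

Furthermore, it is easy to verify that $r_2 r_1 \omega_0$ is recognized by $M'$.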

The other cases in which the first transition involves an operation on the 
second component are rather easy compared to the previous case. 

"$L(G) \subseteq L(M)$". Somewhat more generally, the following assertion can be
shown to hold by induction on the length $n$ of the derivation. From this it can
be concluded straightforwardly, as we have seen above, that for each derivation
yielding only terminal symbols there is a computation in $M$ accepting the
corresponding terminal string.

\[ ((q_1, A, q_4, B), \varepsilon) \Rightarrow_G^{n} (a_1, \omega_1) \ldots (a_j, \omega_j)((q_2, C, q_3, C), r_1 r_2 \ldots r_k)(b_k, \varepsilon) \ldots (b_1, \varepsilon) \]

and

\[ (q_3, r_k \ldots r_1, c_1) \vdash_{M'}^{*} (q_4, \varepsilon, c_2) \]

and

\[ \omega_i^R \in L(M') \text{ for } 1 \le i \le j \]

implies

\[ (q_1, a_1 \ldots a_j, (c_0, A)) \vdash_M^{*} (q_2, \varepsilon, (c_0, CA)) \]

and

\[ (q_3, b_k \ldots b_1, (c_1, CA)) \vdash_M^{*} (q_4, \varepsilon, (c_2, B)) \]

In case $n = 0$ we have $q_1 = q_2$, $q_3 = q_4$, $A = B = C$ and $k = 0$ and the assertion
is trivially true.

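The interesting step for the induction is a derivation starting with a production
introduced by (3) of the construction. The derivation will now be of the
following form:

\[
\begin{aligned}
& ((q_1, A, q_7, A), \varepsilon) \Rightarrow_G && (1)\\
& ((q_1, a_k, q_3), \varepsilon)((q_3, C, q_4, \varepsilon), \varepsilon)((q_4, A, q_7, A), r_0) \Rightarrow_G^{*} && (2)\\
& (a_0, \omega_0) \ldots (a_k, \varepsilon)((q_3, C, q_4, \varepsilon), \varepsilon)((q_4, A, q_7, A), r_0) \Rightarrow_G^{*} && (3)\\
& (a_0, \omega_0) \ldots (a_k, \varepsilon)(a_{k+1}, \omega_{k+1}) \ldots (a_l, \omega_l)((q_4, A, q_7, A), r_0) \Rightarrow_G^{*} && (4)\\
& (a_0, \omega_0) \ldots (a_l, \omega_l) \ldots (a_m, \omega_m)((q_5, D, q_6, D), r_0 r_1 \ldots r_n)(b_n, \varepsilon) \ldots (b_1, \varepsilon)
\end{aligned}
\]

By induction and by the construction we can conclude,

\[
\begin{aligned}
& (q_1, a_0 \ldots a_m, (c_0, A)) \vdash_M^{*} && \text{by induction from (2)}\\
& (q_2, a_k \ldots a_m, (c_0, A)) \vdash_M && \text{by the construction from (1)}\\
& (q_3, a_{k+1} \ldots a_m, (c_0, CA)) \vdash_M^{*} && \text{by induction from (3)}\\
& (q_4, a_{l+1} \ldots a_m, (c_0, A)) \vdash_M^{*} && \text{by induction from (4)}\\
& (q_5, \varepsilon, (c_0, DA))
\end{aligned}
\]

And

\[
\begin{aligned}
& (q_6, b_n \ldots b_1, (c_0, DA)) \vdash_M^{*} && \text{by induction from (3)}\\
& (q_7, \varepsilon, (c_0, A))
\end{aligned}
\]

Note that the first induction step does not follow from the induction hypothesis
as formulated here. But it is easy to verify this assertion in order to complete the
proof. □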



Theorem 3. For each concatenating storage type $S$ the following holds.

\[ \mathcal{L}(S \circ_w S_{pd}) = L(\mathcal{G}_{ERLD}/\mathcal{L}(S)^R) \] □

According to this theorem, automata using sequences of pushdowns formed by
concatenation w.r.t. writing accept languages of the hierarchy that is defined
as follows: $\mathcal{L}^w_0 = \mathcal{L}_{reg}$ and $\mathcal{L}^w_i = L(\mathcal{G}_{ERLD}/(\mathcal{L}^w_{i-1})^R)$. Because
$L(\mathcal{G}_{ERLD}/\mathcal{L})^R = L(\mathcal{G}_{ELLD}/\mathcal{L}) \subseteq L(\mathcal{G}_{LD}/\mathcal{L})$, the control language is each time a mildly context
sensitive language and consequently the language generated by the controlled
grammar is as well. Finally, we can conclude that CFL-S-Gs with $S$ a sequence
of pushdowns formed by concatenation w.r.t. writing generate only mildly context
sensitive languages.
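The recursive shape of this hierarchy can be made concrete by unfolding the two definitions as strings. This is a purely notational Python sketch: the function names are hypothetical, `o_w` stands for concatenation w.r.t. writing, and the first level uses the identity of the trivial storage concatenated with a pushdown being a pushdown, as listed in Table 1.

```python
def storage_type(i: int, mode: str = "w") -> str:
    """Spell out S_i = S_{i-1} o S_pd, starting from S_0 = S_triv."""
    if i == 0:
        return "S_triv"
    if i == 1:
        return "S_pd"  # S_triv o S_pd = S_pd
    prev = storage_type(i - 1, mode)
    if i > 2:
        prev = f"({prev})"  # the concatenation nests to the left
    return f"{prev} o_{mode} S_pd"


def language_class(i: int) -> str:
    """Spell out L^w_i = L(G_ERLD / (L^w_{i-1})^R), starting from L^w_0 = L_reg."""
    if i == 0:
        return "L_reg"
    return f"L(G_ERLD / ({language_class(i - 1)})^R)"
```

For instance, `storage_type(3)` gives `(S_pd o_w S_pd) o_w S_pd`, and `language_class(1)` gives `L(G_ERLD / (L_reg)^R)`.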



6 Conclusion 

Picking up from current linguistic theories the idea that a grammar should differ- 
entiate between various types of dependencies, I have defined two possibilities to 
extend LIGs. Both ways formalize the idea of using a number of index pushdowns 
with a restricted accessibility. This induced two hierarchies of languages, which 
are presented schematically in Table 1. The upper part displays the hierarchy
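Table 1. Overview of Concatenated Storages

  Storage types                                       Language classes
  ------------------------------------------------------------------------------------------------
  $S_0 = S_{triv}$                                    $\mathcal{L}^r_0 = \mathcal{L}_{reg}$
  $S_i = S_{i-1} \circ_r S_{pd}$                      $\mathcal{L}^r_i = L(\mathcal{G}_{ELLD}/\mathcal{L}^r_{i-1})$

  $S_0 = S_{triv}$                                    $\mathcal{L}^r_0 = \mathcal{L}_{reg}$
  $S_1 = S_{triv} \circ_r S_{pd} = S_{pd}$            $\mathcal{L}^r_1 = L(\mathcal{G}_{ELLD}/\mathcal{L}_{reg}) = \mathcal{L}_{CF}$
  $S_2 = S_{pd} \circ_r S_{pd}$                       $\mathcal{L}^r_2 = L(\mathcal{G}_{ELLD}/\mathcal{L}_{CF}) = \mathcal{L}_{ELLI}$
  $S_3 = (S_{pd} \circ_r S_{pd}) \circ_r S_{pd}$
  ------------------------------------------------------------------------------------------------
  $S_0 = S_{triv}$                                    $\mathcal{L}^w_0 = \mathcal{L}_{reg}$
  $S_i = S_{i-1} \circ_w S_{pd}$                      $\mathcal{L}^w_i = L(\mathcal{G}_{ERLD}/(\mathcal{L}^w_{i-1})^R)$

  $S_0 = S_{triv}$                                    $\mathcal{L}^w_0 = \mathcal{L}_{reg}$
  $S_1 = S_{triv} \circ_w S_{pd} = S_{pd}$            $\mathcal{L}^w_1 = L(\mathcal{G}_{ERLD}/(\mathcal{L}_{reg})^R) = L(\mathcal{G}_{ERLD}/\mathcal{L}_{reg}) = \mathcal{L}_{CF}$
  $S_2 = S_{pd} \circ_w S_{pd}$                       $\mathcal{L}^w_2 = L(\mathcal{G}_{ERLD}/(\mathcal{L}_{CF})^R) = L(\mathcal{G}_{ERLD}/\mathcal{L}_{CF}) = \mathcal{L}_{ERLI}$
  $S_3 = (S_{pd} \circ_w S_{pd}) \circ_w S_{pd}$      $\mathcal{L}^w_3 = L(\mathcal{G}_{ERLD}/(\mathcal{L}_{ERLI})^R) = L(\mathcal{G}_{ERLD}/\mathcal{L}_{ELLI})$
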

that was established by P, using a restriction on reading. On the left hand side 
we have the new definition of the storage types that were used in the original 
definition. On the right hand side we have the representation using controlled 
languages according to Theorem El The lower part of the table presents the new 
hierarchy. Interestingly, the second member of this hierarchy is a restricted class,
the extended right linear indexed languages, denoted by $\mathcal{L}_{ERLI}$, that we have defined
in p. There we argue that the candidates for inheriting the stack of indices 
can be restricted while all possibilities that unrestricted LIGs provide for the 
description of natural languages are maintained. The classes of both hierarchies 
are subclasses of the classes of a hierarchy of mildly context sensitive languages 
defined by m- Thus, their generative capacity seems appropriate for natural 
languages. I have, however, argued that the languages and the storage types of 
the latter hierarchy are more adequate for the description of natural languages 
on empirical as well as on language theoretical grounds. 